Professor Peter Willett

MA (Oxford), MSc (Sheffield), PhD (Sheffield), DSc (Sheffield)

Information School

Professor Emeritus

p.willett@sheffield.ac.uk

Full contact details

Professor Peter Willett
Information School
The Wave
2 Whitham Road
Sheffield
S10 2AH
Profile

I obtained an MA in Natural Sciences (Chemistry) from Oxford University, after which I came to the Postgraduate School of Librarianship and Information Science (as it was then called) at the University of Sheffield in 1975 to study for an MSc in Information Science. After a PhD on the indexing of chemical reactions and post-doctoral work on the automatic classification of document databases, I was appointed to a lectureship in 1979. I was awarded a personal chair in 1991 and a DSc in 1997, and spent my entire professional career here in the School, retiring in December 2019 as Professor Emeritus.

Research interests

My research focused principally on the development of novel techniques for a range of important applications in chemoinformatics, but I also made significant contributions to information retrieval and to bibliometrics. Many algorithms originally developed in my research group are embodied in operational chemoinformatics software that is in use throughout the world, with the GOLD, GASP and GALAHAD programs for ligand-protein docking and pharmacophore mapping being widely distributed on a commercial basis.

My research has been reported in over 550 articles, books, chapters, reports etc. that have attracted over 36,000 citations in Google Scholar (h-index of 86). I supervised over 70 successful PhD studentships and was awarded 91 research grants and contracts to a total value of ca. £6.5M. Much of this work was supported by industry, the collaborations involving GlaxoSmithKline, Johnson and Johnson, Eli Lilly, Novartis, Pfizer, Syngenta and Unilever inter alia.

Publications

Books

  • Chowdhury G, Gillet V, McLeod J & Willett P (2018) Preface.

Journal articles

Chapters

Conference proceedings papers

  • Kannas C, Read W, Ruddock N, Fletcher M, Jackson T, Stevens R, Winter J, Willett P & Gillet V (2015) Multiobjective transformation based de novo design: A case study of surfactants. Abstracts of Papers of the American Chemical Society, Vol. 250
  • Willett P (2015) Molecular similarity approaches in chemoinformatics: Early history and bibliometric analysis. Abstracts of Papers of the American Chemical Society, Vol. 250
  • Clough P, Willett P & Lim J (2015) Unfair Means: Use Cases Beyond Plagiarism (pp 229-234)
  • Arif SM, Hert J, Holliday JD, Malim N & Willett P (2009) Enhancing the Effectiveness of Fingerprint-Based Virtual Screening: Use of Turbo Similarity Searching and of Fragment Frequencies of Occurrence. Pattern Recognition in Bioinformatics, Vol. 5780 (pp 404-414)
  • Gillet VJ & Willett P (2008) CINF 32-Academic-industrial collaboration in chemoinformatics: Experiences from the UK. Abstracts of Papers of the American Chemical Society, Vol. 235
  • Gillet VJ, Holliday J & Willett P (2008) CINF 72-Graduate training in chemoinformatics at the University of Sheffield. Abstracts of Papers of the American Chemical Society, Vol. 235
  • Pasupa K, Harrison RF & Willett P (2007) Parsimonious kernel fisher discrimination. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 4477 LNCS(PART 1) (pp 531-538)
  • Hert J, Willett P, Wilton D, Azzaoui K, Jacoby E & Schuffenhauer A (2005) Turbo similarity searching. Abstracts of Papers of the American Chemical Society, Vol. 230 (pp U1020-U1021)
  • Hert J, Willett P, Wilton DJ, Acklin P, Azzaoui K, Jacoby E & Schuffenhauer A (2005) Fingerprint based virtual screening using multiple reference structures. Abstracts of Papers of the American Chemical Society, Vol. 229 (pp U609-U609)
  • Willett P (2005) Molecular similarity approaches for chemoinformatics. Abstracts of Papers of the American Chemical Society, Vol. 229 (pp U774-U774)
  • Holliday JD, Salim N & Willett P (2005) On the magnitudes of coefficient values in the calculation of chemical similarity and dissimilarity. Chemometrics and Chemoinformatics, Vol. 894 (pp 77-95)
  • Willett P (2004) Chemoinformatics: an application domain for information retrieval techniques. SIGIR (pp 393-393)
  • Willett P & Clark RD (2003) 3D atom-based alignment with hypermolecules. Abstracts of Papers of the American Chemical Society, Vol. 226 (pp U455-U455)
  • Bath P, Craigs C, Maheswaran R, Raymond J & Willett P (2002) Use of graph theory for data mining in public health. Management Information Systems, Vol. 6 (pp 819-828)
  • Gaizauskas R, Herring P, Oakes M, Beaulieu M, Willett P, Fowkes H & Jonsson A (2001) Intelligent access to text. Proceedings of the first international conference on Human language technology research - HLT '01, 18 March 2001 - 21 March 2001.
  • Turner DB & Willett P (2000) EVA QSAR: Development of models with enhanced predictivity (EVA_GA). Molecular Modeling and Prediction of Bioactivity (pp 331-333)
  • Willett P (1999) Identification of bioisosteric molecules using field-based similarity searching. Acta Crystallographica A-Foundation and Advances, Vol. 55 (pp 71-71)
  • Bayley MJ, Gillet VJ, Willett P, Bradshaw J & Green DVS (1999) Computational analysis of molecular diversity for drug discovery. Proceedings of the Annual International Conference on Computational Molecular Biology, RECOMB (pp 321-330)
  • Willett P (1999) Matching of chemical and biological structures using subgraph and maximal common subgraph isomorphism algorithms. Rational Drug Design, Vol. 108 (pp 11-38)
  • Poirrette AR, Artymiuk PJ, Rice DW & Willett P (1998) Comparison of protein surfaces using a genetic algorithm. J. Comput. Aided Mol. Des., Vol. 12 (pp 557-569)
  • Ginn CMR, Ranada SS, Willett P & Bradshaw J (1998) The application of data fusion to similarity searching in chemical databases. Fusion'98: Proceedings of the International Conference on Multisource-Multisensor Information Fusion, Vols. 1 and 2 (pp 307-313)
  • O'Rourke AJ, Robertson AM, Willett P, Eley P & Simons P (1997) Word variant identification in Old French. Information Research, Vol. 2(4) (pp 19-23)
  • Turner DB, Tyrrell SM & Willett P (1997) Rapid quantification of molecular diversity for selective database acquisition. Journal of Chemical Information and Computer Sciences, Vol. 37(1) (pp 18-22)
  • Ginn CMR, Turner DB, Willett P, Ferguson AM & Heritage TW (1997) Similarity searching in files of three-dimensional chemical structures: Evaluation of the EVA descriptor and combination of rankings using data fusion. Journal of Chemical Information and Computer Sciences, Vol. 37(1) (pp 23-37)
  • Ellis D, Furner-Hines J & Willett P (1995) Is the manual creation of hypertext worth the effort? Libraries and Publishers, Vol. 4 (pp 122-138)
  • Kirriemuir JW & Willett P (1995) Use of cluster analysis methods for analysing the outputs of multiple-database searches. Electronic Library and Visual Information Research - ELVIRA 2 (pp 117-126)
  • Ellis D, Furner-Hines J & Willett P (1994) On the Measurement of Inter-Linker Consistency and Retrieval Effectiveness in Hypertext Databases. SIGIR (pp 51-60)
  • Ellis D, Furner-Hines J & Willett P (1994) Measuring the Consistency of Assignment of Hypertext Links in Full-text Documents. Information Retrieval (pp 67-80)
  • Wild DJ & Willett P (1994) Similarity Searching in Files of Three-Dimensional Chemical Structures. Implementation of Atom Mapping on the Distributed Array Processor Dap-610, the Maspar MP-1104, and the Connection Machine CM-200. Journal of Chemical Information and Computer Sciences, Vol. 34(1) (pp 224-231)
  • Artymiuk PJ, Rice DW, Grindley HM, Poirrette AR, Ujah EC & Willett P (1994) Identification of β-sheet motifs, of ψ-loops, and of patterns of amino acid residues in three-dimensional protein structures using a subgraph-isomorphism algorithm. Journal of Chemical Information and Computer Sciences, Vol. 34(1) (pp 54-62)
  • Brown RD, Jones G, Willett P & Glen RC (1994) Matching Two-Dimensional Chemical Graphs Using Genetic Algorithms. Journal of Chemical Information and Computer Sciences, Vol. 34(1) (pp 63-70)
  • Brown RD, Downs GM, Jones G & Willett P (1994) Hyperstructure Model for Chemical Structure Handling: Techniques for Substructure Searching. Journal of Chemical Information and Computer Sciences, Vol. 34(1) (pp 47-53)
  • (1993) Proceedings of the 16th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Pittsburgh, PA, USA, June 27 - July 1, 1993
  • Artymiuk PJ, Rice DW, Grindley HM, Mitchell EM, Ujah EC & Willett P (1993) Representation and Searching of 3-D Protein Structures (pp 273-292)
  • Jones G, Brown RD, Clark DE, Willett P & Glen RC (1993) Searching Databases of Two-Dimensional and Three-Dimensional Chemical Structures Using Genetic Algorithms. ICGA (pp 597-602)
  • Robertson AM & Willett P (1992) Searching for historical word-forms in a database of 17th-century English text using spelling-correction methods. Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp 256-265)
  • Artymiuk PJ, Grindley HM, Rice DW, Ujah EC & Willett P (1992) Searching techniques for the tertiary structures of proteins in the Protein Data-Bank. Recent Advances in Chemical Information, Vol. 100 (pp 91-106)
  • Al-Hawamdeh S, De Vere R, Smith G & Willett P (1991) Using nearest-neighbour searching techniques to access full-text documents. Online Information Review, Vol. 15(3-4) (pp 173-190)
  • Downs GM & Willett P (1991) The use of Similarity and Clustering Techniques for the Predictions of Molecular Properties. Applied Multivariate Analysis in SAR and Environmental Studies, Vol. 2 (pp 247-279)
  • Cringean JK, England R, Manson GA & Willett P (1990) Parallel Text Searching in Serial Files Using a Processor Farm. SIGIR (pp 429-453)
  • Croft WB, Lucia TJ, Cringean J & Willett P (1989) Retrieving documents by plausible inference: An experimental study. Information Processing and Management, Vol. 25(6) (pp 599-614)
  • Rasmussen EM & Willett P (1987) Non-Hierarchic Document Clustering Using the ICL Distributed Array Processor. SIGIR (pp 132-139)
  • El-Hamdouchi A & Willett P (1986) Hierarchic Document Clustering Using Ward's Method. SIGIR (pp 149-156)
Grants

Research Projects

Open-Access Mega Journals and the Future of Scholarly Communication

Arts and Humanities Research Council Investigator £421,465 2 November 2015 24 months

Open-access 'mega-journals' are an emerging publishing trend which has the potential to reshape the way researchers share their findings, remoulding the academic publishing market and radically changing the nature and reach of scholarship. This project will investigate the influence of mega-journals in the academic community and beyond.

Bio-renewable Formulation - 6 month extension

Unilever Investigator £49,171 1 February 2015 12 months

Bio-renewable Formulation Information and Knowledge Management System

Technology Strategy Board Investigator £24,992 1 April 2014 24 months

Innovative ICT can play a crucial role in many innovation processes, but its potential is not always exploited. A route to innovation in formulated product industries is the exploitation of materials that would otherwise be lost to waste streams from current manufacturing processes. This is exciting in terms both of realising additional value from manufacturing and of reducing the use of unsustainable material sources and exploiting novel feedstocks for novel functional materials with new application benefits. This project will develop an information system based on highly innovative information technologies with the capability to rapidly identify the feedstock and functional material opportunities for formulated products, and demonstrate its value in rapid bio-derived surfactant discovery. It aims to support chemical-using industries where environmental impact, sustainability and materials security are increasingly significant drivers of innovation alongside improved performance in formulated products. Project partners are Unilever, British Sugar, Croda, Cybula, University of Manchester and University of Liverpool.

N8 Biohub Information and Knowledge Management System

Technology Strategy Board Investigator £131,128 1 October 2013 28 months

The overall aim of this project is to build, and demonstrate the value of, an information system (IS) to support the creation of a "Bio-Hub" centred on the N8 university group. The IS will demonstrate how functional ingredients from simple transformations of sustainable plant and waste feedstocks can be identified more quickly, and will recommend the best feedstocks for a particular function. It will address two big data problems using clever algorithms: semantic extraction of the available domain literature (terabytes) and optimised global search algorithms to explore the combinatorially large number of transformation products (up to petabytes). The innovations are in the creation of algorithms robust enough to run semi-autonomously in an information system and in bringing these together with all the other components. The value will be demonstrated for specific feedstocks and applications, but the ICTs will be selected for simple extension to, and maintenance of, the overall information domain. Project partners are Unilever, British Sugar, Croda, Cybula and University of Manchester.

AstraZeneca Collaboration - Pharmacophores

AstraZeneca Investigator £74,512 1 January 2007 12 months

The project involved the development of a new multiobjective optimisation method for pharmacophore identification from sets of active compounds. A pharmacophore describes the three-dimensional arrangement of chemical features required for a small molecule to bind to a receptor and the aim of this project was to deduce the pharmacophore from a series of active compounds in the absence of the structure of the receptor itself. This involves superposing the compounds so that their common features are overlaid.

Sanofi-Sheffield collaboration

Sanofi-Aventis Principal Investigator £92,421 1 January 2007 12 months

Array design for lead optimisation in pharmaceutical research

GlaxoSmithKline Investigator £252,000 23 October 2006 48 months

This EPSRC-funded project focused on the development of tools to assist medicinal chemists in the design of compound arrays during the lead optimisation stage of drug discovery. Lead optimisation is a complex, time-consuming task, in which chemists seek to obtain a promising balance among potency, off-target interactions, toxicity, and pharmacokinetic behaviour, to identify a candidate molecule to progress to clinical trials. The focus has been on inverse QSAR, that is, determining the structural change necessary to achieve a desired change in property. This was approached through retrospective studies of lead optimisation projects within the GSK archive and the development of computational tools that can be applied in prospective array design to inform decision making by chemists. These included a novel context-sensitive approach to matched molecular-pairs analysis.

Vector Analysis of 2D Fingerprints and Screening Data

Xention Limited Principal Investigator £13,000 1 July 2005 2 months

Support tools for automatic pharmacophore generation

Pfizer Investigator £71,467 1 March 2004 22 months

Johnson and Johnson PhD

Janssen Pharmaceuticals N.V. Principal Investigator £76,305 1 January 2004 36 months

Richmond Continuation

Tripos Principal Investigator £14,800 1 January 2004 3 months

Mining molecular bioassay data

Pfizer Principal Investigator £125,016 1 January 2003 24 months

Cheminformatics methods for HTS and profiling data analysis

Novartis Principal Investigator £50,001 1 December 2002 36 months

PhD studentship

Development of novel methods for protein surface representation and comparison

Biotechnology and Biological Sciences Research Council Investigator £151,376 1 December 2001 24 months

Generation of 3D Hyperstructures

Tripos Principal Investigator £106,000 1 December 2001 24 months

Use of graph-theoretical methods in computational chemistry for pattern identification

Medical Research Council Investigator £49,392 1 May 2001 36 months

Probabilistic prediction of bioactivity

Zeneca Pharmaceuticals Principal Investigator £170,251 1 March 2001 36 months

Genome Analysis Using DNA Structure

Biotechnology and Biological Sciences Research Council Investigator £134,164 1 January 2001 36 months

Discrete Mathematical Approaches to Chemical Information Retrieval

Parke Davis Neuroscience Research Centre Principal Investigator £31,396 1 October 2000 36 months
Professional activities and memberships
  • Recipient of the Skolnik Award of the American Chemical Society (1993), of the Distinguished Lecturer Award of the New Jersey Chapter of the American Society for Information Science (1997), of the Tony Kent Strix Award of the Institute of Information Scientists (2001), of the Lynch Award of the Chemical Structure Association Trust (2002), of the American Chemical Society Award for Computers in Chemical and Pharmaceutical Research (2005), of the Patterson-Crane Award of the American Chemical Society (2010), and of the Jason Farradane Award of the UK e-Information Group (2012).
  • Included in Who’s Who in Science and Engineering (1995-) and Who’s Who (2004-).

Editorial board membership

  • Journal of Documentation
  • MATCH Communications in Mathematical and in Computer Chemistry

Journal and conference reviewing

A huge range of journals, most recently: Aslib Journal of Information Management; Educational Research and Reviews; Global Knowledge, Memory and Communication; Information Research; Journal of Chemical Information and Modeling; Journal of Documentation; Journal of Organic Chemistry; MATCH Communications in Mathematical and in Computer Chemistry; Performance Measurement and Metrics.