Dr Denis Newman-Griffis (they/them)
BA (Carleton), MSc (Ohio State), PhD (Ohio State)
Information School
Lecturer in Data Science
+44 114 222 2647
Information School
Room C224
The Wave
2 Whitham Road
Sheffield
S10 2AH
Profile
I am a data scientist by way of computer science, computational linguistics, and health informatics. I completed my undergraduate degree in Computer Science and Russian, then worked as a business software developer for two years. I went on to complete postgraduate training in Computer Science and Engineering, during which I worked with the National Institutes of Health Clinical Center to develop natural language processing (NLP) methods to support and inform the U.S. Social Security Administration's disability benefits programmes. I completed postdoctoral training in the Department of Biomedical Informatics at the University of Pittsburgh before joining the Information School as a Lecturer in Data Science in 2022.
My research explores equitable AI and data science for human well-being, including intersections of data and disability, health NLP, and policy and practice of responsible AI. I have published extensively across these topics in computer science, health informatics, and social science venues, and am leading funded projects in responsible AI practice and disability informatics. I organise the workshop series on AI for Function, Disability, and Health, and received the American Medical Informatics Association (AMIA) Doctoral Dissertation Award for my work on NLP and disability.
I am a member of the UK Young Academy and currently serve as its Co-Chair. I am proud to be a queer researcher in data science, and I serve as a Non-Binary Role Model at the University of Sheffield. I have previously been involved in organising events through Queer in AI, and I am passionate about supporting LGBT+/queer students and staff in the academic community.
University responsibilities
- Deputy Programme Coordinator, BSc Data Science (2023-)
- Interim Programme Coordinator, BSc Data Science (Semester 2, 2023-2024)
- Deputy Programme Coordinator, MSc Data Science (2022-2023)
- Member of the University Task & Finish Group on Generative AI (2022-2023)
- Member of the University Academic and Student Product Board (2023-)
Research interests
My research investigates better ways to connect people with data-driven insights using artificial intelligence. I approach this in a highly interdisciplinary way, drawing on AI and data science, critical disability studies, health informatics, critical data studies and linguistics.
My recent research generally follows three themes:
Responsible AI practices: I am leading the Research on Research Institute’s GRAIL project on responsible use of AI and machine learning in research funding and evaluation, supported by an international collaboration of research funding agencies.
Data, AI, and disability: I work on developing new NLP approaches to analyse information about function and disability experience for health and well-being, as well as critically analysing AI technology design and implementation from disability-centred perspectives. I am supervising a PhD student (Jun Wang) investigating information needs for disability-centred care.
Practical health NLP: I work on improving generalisability of methods for extracting health information from text, and developing new approaches for evaluating real-world impact of health NLP.
I am interested in supervising PhD projects across these areas, as well as in research on effective data science education. I primarily use quantitative methods, particularly statistical analysis and machine learning, but my work includes survey-based and qualitative methods as well.
Publications
Journal articles
- Editorial: Artificial intelligence for human function and disability. Frontiers in Digital Health, 5.
- Definition drives design: Disability models and mechanisms of bias in AI technologies. First Monday, 28(1).
- A roadmap to reduce information inequities in disability with digital health and natural language processing. PLOS Digital Health, 1(11).
- Information extraction framework for disability determination using a mental functioning use-case. JMIR Medical Informatics, 10(3).
- Digital scarlet letters: sexually transmitted infections in the electronic medical record. Sexually Transmitted Diseases, 49(6), e70-e74.
- Improving broad-coverage medical entity linking with semantic type prediction and large-scale datasets. Journal of Biomedical Informatics, 121.
- Linking free text documentation of functioning and disability to the ICF with natural language processing. Frontiers in Rehabilitation Sciences, 2.
- Automated coding of under-studied medical concept domains: linking physical activity reports to the international classification of functioning, disability, and health. Frontiers in Digital Health, 3.
- A comprehensive study of mobility functioning information in clinical notes: Entity hierarchy, corpus annotation, and sequence labeling. International Journal of Medical Informatics, 147.
- Ambiguity in medical concept normalization: An analysis of types and coverage in electronic health record datasets. Journal of the American Medical Informatics Association, 28(3), 516-532.
- Broadening horizons: the case for capturing function and the role of health informatics in its use. BMC Public Health, 19.
- Representation of child and youth participation within the Unified Medical Language System (UMLS). Disability and Rehabilitation, 1-6.
Chapters
- Configuring Data Subjects, Dialogues in Data Power (pp. 10-30). Bristol University Press
- Big Data for Big Investments, Artificial Intelligence and Evaluation (pp. 120-143). Routledge
Conference proceedings papers
- Half the picture: word frequencies reveal racial differences in clinical documentation, but not their causes. Proceedings of the 2022 AMIA Informatics Summit (pp. 386-395). Chicago, Illinois, USA, 21 March 2022.
- Robust knowledge graph completion with stacked convolutions and a student re-ranking network. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1016-1029). Online, 1 August 2021.
- Preface. Artificial Intelligence for Function, Disability, and Health 2021 (AI4Function 2021), Vol. 2926. Online, 20 August 2021 - 21 August 2021.
- Translational NLP: a new paradigm and general principles for natural language processing research. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 4125-4138). Online, 6 June 2021.
- TextEssence: a tool for interactive analysis of semantic shifts between corpora. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations (pp. 106-115). Online, 6 June 2021.
- Introducing information retrieval for biomedical informatics students. Proceedings of the Fifth Workshop on Teaching NLP (pp. 96-98). Online, 10 June 2021.
- Development of natural language processing tools to support determination of federal disability benefits in the U.S. Proceedings of the 1st Workshop on Language Technologies for Government and Public Administration (LT4Gov) (pp. 1-6). Marseille, France, 11 May 2020.
- Parallel data-local training for optimizing Word2Vec embeddings for word and graph embeddings. 2019 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC) (pp. 44-55). Denver, CO, USA, 18 November 2019.
- HARE: a flexible highlighting annotator for ranking and exploration. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations (pp. 85-90). Hong Kong, China, 3 November 2019.
- Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings. Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019) (pp. 146-156). Hong Kong, China, 3 November 2019.
- Characterizing the impact of geometric properties of word embeddings on task performance. Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for (pp. 8-17). Minneapolis, USA, 6 June 2019.
- Classifying the reported ability in clinical mobility descriptions. Proceedings of the 18th BioNLP Workshop and Shared Task (pp. 1-10). Florence, Italy, 1 August 2019.
- Jointly embedding entities and text with distant supervision. Proceedings of The Third Workshop on Representation Learning for NLP (pp. 195-206). Melbourne, Australia, 20 July 2018.
- Embedding transfer for low-resource medical named entity recognition: a case study on patient mobility. Proceedings of the Biomedical Natural Language Processing 2018 workshop (BioNLP 2018) (pp. 1-11). Melbourne, Australia, 19 July 2018.
- Inductive identification of functional status information and establishing a gold standard corpus: a case study on the mobility domain. 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 2319-2321). Kansas City, MO, USA, 13 November 2017.
- Insights into analogy completion from the biomedical domain. Proceedings of the 16th BioNLP Workshop (BioNLP 2017) (pp. 19-28). Vancouver, Canada, 4 August 2017.
- A quantitative and qualitative evaluation of sentence boundary detection for the clinical domain. Proceedings of the 2016 Summit on Translational Bioinformatics, Vol. 2016 (pp. 88-97). San Francisco, CA, United States, 21 March 2016.
Working papers
- Good practice in the use of machine learning & AI by research funding organisations: insights from a workshop series.
Preprints
- Disability data futures: Achievable imaginaries for AI and disability data justice.
- AI Thinking: A framework for rethinking artificial intelligence in practice, Center for Open Science.
- A roadmap to reduce information inequities in disability with digital health and natural language processing, Center for Open Science.
- Definition drives design: Disability models and mechanisms of bias in AI technologies, arXiv.
- Half the picture: Word frequencies reveal racial differences in clinical documentation, but not their causes, Cold Spring Harbor Laboratory.
- Linking Free Text Documentation of Functioning and Disability to the ICF with Natural Language Processing, Cold Spring Harbor Laboratory.
- Information Extraction Framework for Disability Determination Using a Mental Functioning Use-Case (Preprint), JMIR Publications Inc.
- Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network, arXiv.
- Introducing Information Retrieval for Biomedical Informatics Students, arXiv.
- Translational NLP: A New Paradigm and General Principles for Natural Language Processing Research, arXiv.
- TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora, arXiv.
- Automated Coding of Under-Studied Medical Concept Domains: Linking Physical Activity Reports to the International Classification of Functioning, Disability, and Health, arXiv.
- Improving Broad-Coverage Medical Entity Linking with Semantic Type Prediction and Large-Scale Datasets, arXiv.
- Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings, arXiv.
- HARE: a Flexible Highlighting Annotator for Ranking and Exploration, arXiv.
- Classifying the reported ability in clinical mobility descriptions, arXiv.
- Characterizing the impact of geometric properties of word embeddings on task performance, arXiv.
- Jointly Embedding Entities and Text with Distant Supervision, arXiv.
- Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility, arXiv.
- Insights into Analogy Completion from the Biomedical Domain, arXiv.
- Second-Order Word Embeddings from Nearest Neighbor Topological Features, arXiv.
Research group
I am currently supervising the following PhD students:
- Ian Widdows: Secondary school accountability measures in England - their effectiveness, effects and an exploration of alternative approaches. (With Jo Bates)
- Jun Wang: Understanding information needs in person-centred care for age-related disease, multimorbidity, and disability. (Fully-funded Healthy Lifespan Institute PhD studentship; with Peter Bath and Steven Ariss)
- Yi Jiang (visiting PhD student September 2023-March 2024): Semantically-enriched keyphrase generation for scientific papers. (With Mike Thelwall)
I have supervised 10 MSc students to date and strongly support MSc students in pursuing publishable dissertation research.
Grants
FRAIM: Framing Responsible AI Implementation & Management
Funder: AHRC / BRAID
Role: Project Lead
Amount: £286,887
Start date: 1 February 2024
Duration: 6 months
The FRAIM project brings together cross-sector perspectives on organisational RAI policy and process to scope key stakeholders, shared values, and actionable research needs for building the evidence base on implementing and managing RAI.
GRAIL: Getting Responsible about AI and Machine Learning in Research Funding and Evaluation
Funder: Research on Research Institute
Role: Project Lead
Amount: £77,000
Start date: 1 April 2023
Duration: 24 months
The GRAIL project is exploring good principles and practices for using AI and machine learning in the research funding ecosystem in ways that are both ethical and effective.
Faculty of Social Science Education Fund: Student Voice in MSc in Data Science
Funder: TUoS Faculty of Social Science
Role: Principal Investigator
Amount: £3,988
Start date: 21 April 2023
Duration: 3 months
Programme-level review of student feedback for MSc in Data Science course, covering academic years 2017-2018 to 2021-2022.
Teaching interests
I currently co-coordinate the BSc in Data Science (with Dr Morgan Harvey). I helped lead development of Level 1 modules, including a novel integrated programme week to introduce students to the Data Science curriculum, and am helping lead development of Level 2 modules. I designed the Level 1 module INF111 Practical Programming for Data Science 1 (first taught 2023-2024), including development of module assessments and a joint assessment with INF112.
I was formerly Deputy Programme Coordinator for the MSc in Data Science and actively teach on the course, primarily focusing on data mining and AI skills as well as use of big data platforms. I led a funded project reviewing five years of student feedback on the course and am contributing to ongoing curriculum enhancement efforts.
I am highly interested in developing and researching effective data science pedagogy, and I am committed to bringing evidence-based best practices into my teaching. I use active learning methodologies frequently in the classroom and regularly engage with student feedback to adapt and improve my teaching. I am particularly interested in working with students to develop effective, practice-focused interventions to improve synthesis of data science skills.
Teaching activities
Module Coordinator:
- INF111 Practical Programming for Data Science 1
- INF216 Responsible Data Science Lab 1
I further contribute teaching to:
- INF6027 Introduction to Data Science
- INF6032 Big Data Analytics
Professional activities and memberships
I am an active member of:
- American Medical Informatics Association (since 2016)
- Association for Computational Linguistics (since 2017)
- British Computer Society (since 2022)
I am Co-Chair and Executive Group member of the UK Young Academy, and a member of the Young Academy’s 2023 cohort.
I am also a Special Volunteer with the Epidemiology & Biostatistics Section of the U.S. National Institutes of Health Clinical Center, on a project with the U.S. Social Security Administration to develop informatics methods for supporting disability benefits determination.
I am active on the DEI Committee of the American Medical Informatics Association and regularly serve as a Scientific Program Committee member for AMIA conferences.
I regularly serve as a programme committee member and reviewer for a variety of conferences and journals in natural language processing and health informatics, including Association for Computational Linguistics (ACL) conferences (ACL, NAACL, EMNLP, EACL), American Medical Informatics Association (AMIA) conferences (Annual Symposium, Informatics Summit), Association for the Advancement of Artificial Intelligence (AAAI), Journal of the American Medical Informatics Association (JAMIA), Journal of Biomedical Informatics, Frontiers in Digital Health, and BMC Medical Informatics and Decision Making.
I am strongly committed to developing and mentoring student researchers, and I regularly serve as a reviewer for ACL series Student Research Workshops. I also served on the AMIA Student Paper Competition Committee (2021-2023).
I founded the Workshop Series on Artificial Intelligence for Function, Disability, and Health (AI4Function) and have organised it since 2020. I also guest edited a Research Topic on AI4Function in Frontiers in Digital Health.