Dr Xiaorui Jiang
Information School
Lecturer in Data Science

Full contact details
Information School
Room C228
The Wave
2 Whitham Road
S10 2AH
- Profile
My first academic appointment was as a Lecturer in Intelligent Systems in the School of Information Engineering, Zhejiang University of Technology (ZJUT), China, starting in 2013, where I was deputy leader of a big data visual analytics team working on a series of smart city R&D projects in China. I assisted the head of the team in securing more than 3.5M in government funding and 5M in paired industrial investment (both in CNY). Alongside this leadership role, I was able to devote a portion of relatively quiet time to my own research on scientific literature and network analysis, which began during my PhD studies at the Chinese Academy of Sciences. In this direction, I was funded by the National and Provincial Natural Science Foundations for work on heterogeneous academic network analysis. In addition to teaching undergraduate modules on programming languages and data structures, I was involved in creating a series of new modules on big data visualization and visual analytics for both undergraduate and postgraduate courses. In 2015, I received the Excellent Researcher Award of Zhejiang University of Technology.
Since then, my academic career has centred on data science education and research. In 2016, I moved to the Department of Computer Science at Aston University, where I created the Advanced DataBase Systems module and made significant contributions to the Rolls Royce Data Science Programme and the degree apprenticeship programmes. I also worked on several projects that helped SMEs build or grow their businesses using data science concepts and techniques. From December 2019, I began to focus more on Natural Language Processing (NLP) while working briefly as a Senior Researcher in NLP at the National Institute of Information and Communications Technology in Japan. In September 2020, I joined the Centre for Computational Sciences and Mathematical Modelling at Coventry University, where I led cross-disciplinary research in NLP and created the Natural Language Processing module. In 2023 I was nominated for the Outstanding Supervisor Award, and in 2024 I won the Academic Staff of the Year Award of the College of Engineering, Environment and Science, in recognition of my contributions to improving the student experience.
In May 2023, I joined the Information School of the University of Sheffield as a Lecturer in Data Science. I will teach on both undergraduate and postgraduate data science programmes. My current research falls mainly into three areas, which either align with or enhance the existing research groups in the Information School: (1) NLP and Network Analysis for Science and Academia (e.g., scientific text mining, scientometrics, metascience, research on research), which I expect to expand and grow significantly through collaboration in the Information School; (2) NLP for Healthcare/Biomedical Informatics (e.g., evidence-based medicine, NLP solutions for primary care), for which I expect to establish cross-department collaboration within the University of Sheffield; (3) NLP and Network Analysis for Law, a field I am gradually developing in collaboration with the Law and Tech Lab at Maastricht University, the Netherlands, initially funded by the Royal Society International Exchange Scheme. In addition to the above, I am open to, and have worked or am working on, NLP applications in digital humanities, media, and business, among others (e.g., historical/literary text mining, multimodal sport video analysis, technology acceptance analysis, and so on).
- Qualifications
Bachelor/Master of Engineering in Computer Science, Harbin Institute of Technology, China, 2005/2007
Doctor of Engineering in Computer Science, Institute of Computing Technology, Chinese Academy of Sciences, China, 2013
Postgraduate Certificate in Learning and Teaching in Higher Education, Aston University, UK, 2018
Fellow of the Higher Education Academy, 2019
- Research interests
I am interested in supervising PhD students in the following areas:
- Natural language processing and machine learning for mining biomedical publications and (semi-)automating systematic literature review
- Natural language processing and network analytics for understanding science (scientific text mining and citation network analysis) and improving scientific research (meta-science or research on research)
- Natural language processing and machine learning for improving healthcare services in primary and secondary care settings, e.g., clinical text classification, clinical information extraction, algorithmic clinical coding, etc.
- Natural language processing and network analytics for legal informatics applications, such as legal citation network analysis and text mining of legal documents (cases, contracts, T&C, privacy notices)
- Big data analytics and (multi-modal) natural language processing for solving real-world problems in the social sciences, humanities and arts, such as digital inclusiveness, digital vulnerability, digital humanities, AI-assisted surveys, multi-modal sport video summarisation, and so on.
- Publications
Journal articles
- Ensembling approaches to citation function classification and important citation screening. Scientometrics.
- A question-answering framework for automated abstract screening using large language models. J Am Med Inform Assoc.
- Contextualised segment-wise citation function classification. Scientometrics, 128(9), 5117-5158.
- Extracting the evolutionary backbone of scientific domains: The semantic main path network analysis approach based on citation context analysis. Journal of the Association for Information Science and Technology, 74(5), 546-569.
- Main path analysis on cyclic citation networks. Journal of the Association for Information Science and Technology, 71(5), 578-595.
- Forward search path count as an alternative indirect citation impact indicator. Journal of Informetrics, 13(4), 100977.
- DiffusionInsighter: Visual Analysis of Traffic Diffusion Flow Patterns. Chinese Journal of Electronics, 27(5), 942-950.
- Exploiting heterogeneous scientific literature networks to combat ranking bias: Evidence from the computational linguistics area. Journal of the Association for Information Science and Technology, 67(7), 1679-1702.
- Large-scale taxi O/D visual analytics for understanding metropolitan human movement patterns. Journal of Visualization, 18(2), 185-200.
- Scaling Hop-Based Reachability Indexing for Fast Graph Pattern Query Processing. IEEE Transactions on Knowledge and Data Engineering, 26(11), 2803-2817.
- Graph-based algorithms for ranking researchers: not all swans are white! Scientometrics, 96(3), 743-759.
- Adding logical operators to tree pattern queries on graph-structured data. Proceedings of the VLDB Endowment, 5(8), 728-739.
- Effective Natural Language Processing Algorithms for Early Alerts of Gout Flares from Chief Complaints. Forecasting, 6(1), 224-238.
- An Empirical Study of Span Modeling in Science NER, Linking Theory and Practice of Digital Libraries (pp. 41-48). Springer International Publishing
Conference proceedings papers
- Contextualised Modelling for Effective Citation Function Classification. Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval
- Ranking Scientific Articles in a Dynamically Evolving Citation Network. 2016 12th International Conference on Semantics, Knowledge and Grids (SKG) (pp. 154-157), 15 August 2016 - 17 August 2016.
- Semantics in Deep Neural-Network Computing. 2015 11th International Conference on Semantics, Knowledge and Grids (SKG), 19 August 2015 - 21 August 2015.
- Ranking Scientific Articles over Heterogeneous Academic Network. 2015 11th International Conference on Semantics, Knowledge and Grids (SKG), 19 August 2015 - 21 August 2015.
- 3-Hop: A Novel Hop-Based Reachability Index. 2013 Ninth International Conference on Semantics, Knowledge and Grids, 3 October 2013 - 4 October 2013.
- Towards an effective and unbiased ranking of scientific literature through mutual reinforcement. Proceedings of the 21st ACM international conference on Information and knowledge management
- DSMS in ubiquitous-healthcare: A Borealis-based heart rate variability monitor. 2011 4th International Conference on Biomedical Engineering and Informatics (BMEI), 15 October 2011 - 17 October 2011.
- A Resource Space Model for Dataspace. 2010 Sixth International Conference on Semantics, Knowledge and Grids, 1 November 2010 - 3 November 2010.
- The Interactive Computing Model for Cyber Physical Society. 2010 Sixth International Conference on Semantics, Knowledge and Grids, 1 November 2010 - 3 November 2010.
- Doppler blood flow signal analysis meets traditional Chinese pulse diagnosis. 2010 3rd International Conference on Biomedical Engineering and Informatics, 16 October 2010 - 18 October 2010.
- Distinguishing Patients with Gastritis and Cholecystitis from the Healthy by Analyzing Wrist Radial Arterial Doppler Blood Flow Signals. 2010 20th International Conference on Pattern Recognition, 23 August 2010 - 26 August 2010.
- Teaching interests
I am interested in teaching the following subjects: natural language processing, text mining, machine learning, data mining and databases, data structures and algorithms, and other data science-related modules.
I am also keen to supervise postgraduate dissertation projects in machine learning for text and in scientometric/bibliometric analysis, and I welcome collaboration with motivated PG students as they embark on interesting research topics.
- Teaching activities
INF217, Databases and beyond, Data Science BSc, Module coordinator
INF216, Responsible Data Science Lab 1, Data Science BSc, Module contributor
INF319, Responsible Data Science Lab 2
INF6050, Database Design
- Professional activities and memberships
I am a lifetime member of the International Society for Scientometrics and Informetrics and a member of the Association for Computational Linguistics.
I am an active reviewer for the Journal of the Association for Information Science and Technology, Journal of Informetrics, and Scientometrics.