Dr Diana Maynard
School of Computer Science
Senior Research Fellow
Deputy Head of the Natural Language Processing research group
d.maynard@sheffield.ac.uk
+44 114 222 1938
Regent Court (DCS)
Full contact details
Dr Diana Maynard
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
- Research interests
- Information extraction
- GATE
- Social media analysis
- Sentiment analysis
- Online abuse and misinformation detection
- Term recognition
- Ontologies and semantic web
- Freedom of the media
- NLP for scientometrics
- Publications
Books
- The Chilling: A global study of online violence against women journalists. ICFJ.
- Natural Language Processing for the Semantic Web. Springer International Publishing.
- Natural Language Processing for the Semantic Web. Morgan & Claypool Publishers.
- Text Processing with GATE (Version 6). GATE.
- Preface.
- Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface.
Journal articles
- Similarity-Aware Multimodal Prompt Learning for fake news detection. Information Sciences, 647, 119446-119446.
- Using natural language processing and artificial intelligence to explore the nutrition and sustainability of recipes and food. Frontiers in Artificial Intelligence, 3.
- Classification aware neural topic model for COVID-19 disinformation categorisation. PLoS ONE, 16(2).
- Using ontologies to map between research data and policymakers’ presumptions: the experience of the KNOWMAK project. Scientometrics.
- Strengthening the Monitoring of Violations against Journalists through an Events-Based Methodology. Media and Communication, 8(1), 89-100.
- What matters most to people around the world? Retrieving Better Life Index priorities on Twitter. Technological Forecasting and Social Change.
- Pro-Environmental Campaigns via Social Media: Analysing Awareness and Behaviour Patterns. Journal of Web Science, 3(1), 1-15.
- A framework for real-time semantic social media analysis. Journal of Web Semantics, 44, 75-88.
- Distantly supervised Web relation extraction for knowledge base population. Semantic Web, 7(4), 335-349.
- Interlinking Documents Based on Semantic Graphs with an Application, 139-155.
- Entity-Based Opinion Mining from Text and Multimedia, 65-86.
- Analysis of named entity recognition and linking for tweets. Information Processing & Management, 51(2), 32-49.
- The semantic web challenge 2012. Journal of Web Semantics, 24, 1-2.
- Analysing and Enriching Focused Semantic Web Archives for Parliament Applications. Future Internet, 6(3), 433-456.
- Automatic detection of political opinions in tweets. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7117 LNCS, 88-99.
- The Semantic Web Challenge, 2011. Journal of Web Semantics.
- The semantic web challenge, 2010. Journal of Web Semantics, 9(3), 315.
- Automatic detection of political opinions in tweets. CEUR Workshop Proceedings, 718, 81-92.
- Using lexico-syntactic ontology design patterns for ontology creation and population. CEUR Workshop Proceedings, 516, 39-52.
- NLP-based support for ontology lifecycle development. CEUR Workshop Proceedings, 514.
- Information extraction: Algorithms and prospects in a retrieval context. Computational Linguistics, 34(2), 315-317.
- NLP techniques for term extraction and ontology population. Frontiers in Artificial Intelligence and Applications, 167(1), 107-127.
- REASE - The repository for learning units about the Semantic Web. New Review of Hypermedia and Multimedia, 13(2), 211-237.
- Preface. IBM Syst. J., 45, 3-6.
- Evolving GATE to meet new challenges in language engineering. Natural Language Engineering, 10(3-4), 349-373.
- Corpus Linguistics and South Asian Languages: Corpus Creation and Tool Development. Literary and Linguistic Computing, 19(4), 509-524.
- Architectural Elements of Language Engineering Robustness. Journal of Natural Language Engineering, 8(2-3), 257-274.
- TRUCKS: A Model for Automatic Multi-Word Term Recognition. Journal of Natural Language Processing, 8(1), 101-125.
- Similarity-Aware Multimodal Prompt Learning for Fake News Detection. SSRN Electronic Journal.
- Should I Care about Your Opinion? Detection of Opinion Interestingness and Dynamics in Social Media. Future Internet, 6(3), 457-481.
Chapters
- Language Report English, European Language Equality (pp. 127-130). Springer International Publishing
- Preface (pp. V-VII).
- Challenges in Analysing Social Media. In Dusa A, Nelle D, Stock G & Wagner G (Eds.), Facing the Future: European Research Infrastructures for the Humanities and Social Sciences. Berlin: SCIVERO Verlag.
- Natural language processing, Perspectives on Ontology Learning (pp. 51-67).
- Documenting Contemporary Society by Preserving Relevant Information from Twitter. In Weller K, Bruns A, Burgess J, Mahrt M & Puschmann C (Eds.), Twitter and Society. USA: Peter Lang.
Conference proceedings papers
- Dimensions of Online Conflict: Towards Modeling Agonism. Findings of the Association for Computational Linguistics: EMNLP 2023, December 2023.
- Development of a Benchmark Corpus to Support Entity Recognition in Job Descriptions. 2022 Language Resources and Evaluation Conference, LREC 2022 (pp 1201-1208)
- Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network. Proceedings of the 13th International Workshop on Semantic Evaluation, June 2019.
- Using ontologies to map between research and policy data: opportunities and challenges. Proceedings of the 17th International Conference on Scientometrics & Informetrics, Vol. 1 (pp 535-540). Rome, Italy, 2 September 2019 - 5 September 2019.
- Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network. Proceedings of the 13th International Workshop on Semantic Evaluation. Minneapolis, Minnesota, USA, 6 June 2019 - 7 June 2019.
- Exploring knowledge production in Europe. The KNOWMAK tool. 17th International Conference on Scientometrics and Informetrics, ISSI 2019 - Proceedings, Vol. 2 (pp 2561-2562)
- Adapted TextRank for Term Extraction: A generic method of improving automatic term extraction algorithms. Procedia Computer Science, Vol. 137 (pp 102-108), 10 September 2018 - 13 September 2018.
- Cross-Lingual Classification of Crisis Data. The Semantic Web – ISWC 2018, Vol. 11136 (pp 617-633). Monterey, CA, USA, 8 October 2018 - 12 October 2018.
- Helping crisis responders find the informative needle in the tweet haystack. Proceedings of the 15th ISCRAM Conference (pp 649-662). Rochester, NY, USA, 20 May 2018 - 23 May 2018.
- Twits, twats and twaddle: Trends in online abuse towards UK politicians. 12th International AAAI Conference on Web and Social Media, ICWSM 2018 (pp 600-603)
- Ontologies as bridges between data sources and user queries: the KNOWMAK project experience. Proc. of STI 2017
- Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation. Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism, September 2017.
- Towards an Infrastructure for Understanding and Interlinking Knowledge Co-Creation in European research. Proceedings of ESWC 2017 Workshop on Scientometrics. Portoroz, 28 May 2017 - 1 June 2017.
- The Semantic Web
- GATE-time: Extraction of temporal expressions and events. Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 (pp 3702-3708)
- Talking climate change via social media. Proceedings of the 8th ACM Conference on Web Science - WebSci '16, 22 May 2016 - 25 May 2016.
- Challenges of Evaluating Sentiment Analysis Tools on Social Media. Proceedings of the Tenth International Conference on Language Resources and Evaluation, 23 May 2016 - 28 May 2016.
- Automated Content Analysis. Proceedings of the 10th International Conference on Ubiquitous Information Management and Communication - IMCOM '16, 4 January 2016 - 6 January 2016.
- Extracting Relations between Non-Standard Entities using Distant Supervision and Imitation Learning. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, September 2015.
- Real-time Social Media Analytics through Semantic Annotation and Linked Open Data. Proceedings of the ACM Web Science Conference - WebSci '15, 28 June 2015 - 1 July 2015.
- Understanding climate change tweets: an open source toolkit for social media analysis. Proceedings of EnviroInfo and ICT for Sustainability 2015, 7 September 2015 - 9 September 2015.
- Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). Reykjavik, 26 May 2014 - 31 May 2014.
- Relation Extraction from the Web Using Distant Supervision (pp 26-41)
- Introduction. SWAIE 2014 - 3rd Workshop on Semantic Web and Information Extraction, Proceedings of the Workshop (pp III)
- Microblog-genre noise and impact on semantic annotation accuracy. HT 2013 - Proceedings of the 24th ACM Conference on Hypertext and Social Media (pp 21-30)
- TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text. Proceedings of the International Conference on Recent Advances in Natural Language Processing
- Interlinking documents based on semantic graphs. Procedia Computer Science, Vol. 22 (pp 231-240)
- Multimodal sentiment analysis of social media. CEUR Workshop Proceedings, Vol. 1110 (pp 47-58)
- Knowledge extraction and consolidation from social media (KECSM 2012): Preface. CEUR Workshop Proceedings, Vol. 895
- Large Scale Semantic Annotation, Indexing, and Search at The National Archives. LREC 2012 - Eighth International Conference on Language Resources and Evaluation (pp 3487-3494)
- Entity extraction and consolidation for social web content preservation. CEUR Workshop Proceedings, Vol. 912 (pp 18-29)
- Using events for content appraisal and selection in Web archives. CEUR Workshop Proceedings, Vol. 779 (pp 98-107)
- Motivating intelligent email in business: An investigation into current trends for email processing and communication research. 2009 IEEE Conference on Commerce and Enterprise Computing, CEC 2009 (pp 476-482)
- Evaluating Evaluation Metrics for Ontology-Based Applications: Infinite Reflection. LREC
- Benchmarking Textual Annotation Tools for the Semantic Web. Sixth International Conference on Language Resources and Evaluation, LREC 2008 (pp 20-25)
- Ontology-based information extraction for business intelligence. Semantic Web, Proceedings, Vol. 4825 (pp 843-856)
- Natural language technology for information integration in business intelligence. Business Information Systems, Proceedings, Vol. 4439 (pp 366-380)
- Creating tools for morphological analysis of Sumerian. Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006 (pp 1762-1765)
- Metrics for evaluation of ontology-based information extraction. EON 2006 - Evaluation of Ontologies for the Web: 4th International Workshop - Located at the 15th International World Wide Web Conference, WWW 2006
- Metrics for evaluation of ontology-based information extraction. CEUR Workshop Proceedings, Vol. 179
- Ontology-based information extraction for market monitoring and technology watch. CEUR Workshop Proceedings, Vol. 137 (pp 33-42)
- Extracting a domain ontology from linguistic resource based on relatedness measurements. 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings (pp 345-351)
- Multimedia indexing through multi-source and multi-language information extraction: the MUMIS project. Data & Knowledge Engineering, Vol. 48(2) (pp 247-264)
- A lightweight approach to coreference resolution for named entities in text. Anaphora Processing, Vol. 263 (pp 97-111)
- Populating a database from parallel texts using ontology-based information extraction. Natural Language Processing and Information Systems, Vol. 3136 (pp 254-264)
- Automatic language-independent induction of gazetteer lists. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 709-712)
- Creation of reusable components and language resources for Named Entity Recognition in Russian. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 309-312)
- Using parallel texts to improve recall in botany. Recent Advances in Natural Language Processing III, Vol. 260 (pp 237-246)
- Automatic creation and monitoring of semantic metadata in a dynamic knowledge portal. Artificial Intelligence: Methodology, Systems, and Applications, Proceedings, Vol. 3192 (pp 65-74)
- Experiments with geographic knowledge for information extraction. Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references, 31 May 2003.
- Rapid customization of an information extraction system for a surprise language. ACM Trans. Asian Lang. Inf. Process., Vol. 2 (pp 295-300)
- Multilingual adaptations of ANNIE, a reusable information extraction tool. Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03, 12 April 2003 - 17 April 2003.
- NE recognition without training data on a language you don’t speak. ACL Workshop on Multilingual and Mixed-language Named Entity Recognition: Combining Statistical and Symbolic Models. Sapporo, Japan
- GATE: A Unicode-based Infrastructure Supporting Multilingual Information Extraction. Proceedings of Workshop on Information Extraction for Slavonic and other Central and Eastern European Languages (IESL’03). Borovets, Bulgaria
- OLLIE. Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems - SEALTS '03, 31 May 2003.
- GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02). Philadelphia, USA
- Using a text engineering framework to build an extendable and portable IE-based summarisation system. Proceedings of the ACL-02 Workshop on Automatic Summarization, 11 July 2002 - 12 July 2002.
- A framework and graphical development environment for robust NLP tools and applications. ACL (pp 168-175)
- Using Human Language Technology for Automatic Annotation and Indexing of Digital Library Content (pp 613-625)
- A unicode-based environment for creation and use of language resources. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 66-71)
- Using GATE as an environment for teaching NLP. Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics, 7 July 2002.
- Extracting information for automatic indexing of multimedia material. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 669-676)
- How feasible is the reuse of grammars for Named Entity Recognition? Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 1412-1418)
- GATE: an architecture for development of robust HLT applications. 40th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp 168-175)
- Developing reusable and robust language processing components for information systems using GATE. 13th International Workshop on Database and Expert Systems Applications, Proceedings (pp 223-227)
- Adapting a robust multi-genre NE system for automatic content extraction. Artificial Intelligence: Methodology, Systems and Applications, Proceedings, Vol. 2443 (pp 264-273)
- Access to multimedia information through multisource and multilanguage information extraction. Natural Language Processing and Information Systems, Vol. 2553 (pp 160-171)
- GATE. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02, 7 July 2002 - 12 July 2002.
- Named Entity Recognition from Diverse Text Types. Recent Advances in Natural Language Processing 2001 Conference (pp 257-274). Tzigov Chark, Bulgaria
- Identifying terms by their family and friends. Proceedings of the 18th conference on Computational linguistics, 31 July 2000 - 4 August 2000.
- Experience of using GATE for NLP R&D. Proceedings of the Workshop on Using Toolsets and Architectures To Build NLP Systems at COLING-2000. Luxembourg
- Creating and using domain-specific ontologies for terminological applications. 2nd International Conference on Language Resources and Evaluation, LREC 2000
- Comparing Topic-Aware Neural Networks for Bias Detection of News. Proceedings of ECAI 2020, 29 August 2020 - 2 September 2020.
- Combining expert knowledge with NLP for specialised applications. Proc. of 23rd International Conference on Text, Speech and Dialogue. Brno, 29 August 2020 - 2 September 2020.
- Climate Change: A Chance for Political Re-Engagement? Political Studies Association 65th Annual International Conference. Sheffield
Datasets
Preprints
- Examining Temporal Bias in Abusive Language Detection.
- Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus, arXiv.
- Local Media and Geo-situated Responses to Brexit: A Quantitative Analysis of Twitter, News and Survey Data, arXiv.
- Helping Crisis Responders Find the Informative Needle in the Tweet Haystack, arXiv.
- Analysis of Named Entity Recognition and Linking for Tweets, arXiv.
- Online Abuse of UK MPs in 2015 and 2017: Perpetrators, Targets, and Topics.
- A Framework for Real-Time Semantic Social Media Analysis.
- Editorial - Semantic Web Challenge, 2010.
- Research group
Member of the Natural Language Processing research group.
- Grants
Current grants
- Influencing policy work on human rights violations against journalists, Research England, 09/2024 - 06/2025, £34,667, as PI
- Toolkit for Analysing and Visualising Online Violence Against Female Journalists, EPSRC, 04/2024 - 03/2025, £45,363, as PI
- Atrium: Advancing FronTier Research In the Arts and hUManities, Horizon Europe, 01/2024 - 12/2027, £370,950, as PI
Previous grants
- RISIS2: European Research Infrastructure for Science, technology and Innovation policy Studies 2, EC H2020, 01/2019 - 12/2022, £476,741, as Co-PI
- Visualising the environmental impacts of plant-based recipes in Europe, Research England, 12/2021 - 05/2022, £18,407, as PI
- Calculating the environmental impact of plant based recipes, Industrial, 01/2021 - 12/2021, £2,500, as PI
- Pilot project on developing and trialling a toolkit for strengthening national context monitoring of violations against journalists, Free Press, 06/2020 - 12/2020, £29,094, as Co-PI
- Pilot project on developing a database for the improved collection and systematisation of information on incidents of violations against journalists, Free Press, 04/2019 - 11/2019, £29,030, as Co-I
- The Intelligent Automation of Contract Analysis of Collateral Warranties, Innovate UK, 03/2019 - 08/2020, £114,552, as PI
- Social Understandings of Scale: The role of Print and Social Media in the EU Referendum Debate, British Academy, 01/2018 - 06/2019, £49,716, as Co-PI
- Improving the monitoring of violence against journalists, Free Press, 12/2017 - 10/2018, £26,589, as Co-I
- KNOWMAK: Knowledge in the making in the European society, EC H2020, 01/2017 - 12/2019, £196,654, as PI
- COMRADES: Collective Platform for Community Resilience and Social Innovation during Crises, EC H2020, 01/2016 - 12/2018, £257,000, as PI