Professor Hamish Cunningham
School of Computer Science
Professor of Internet Computing
Impact Officer
Member of the Natural Language Processing (NLP) research group
+44 114 222 1891
Full contact details
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
- Profile
Hamish Cunningham is a Research Professor in Computer Science at the University of Sheffield and an executive committee member of the Institute for Sustainable Food. He used to hope that as time passed he would get older and wiser, but it seems that in fact he just gets odder and wider.
He has been a software engineer, researcher, open source developer and Principal Investigator on some €12m of funded research, and in 2014 ran a successful crowdfunding campaign to produce the MoPi mobile power board for the Raspberry Pi.
He has published some 200 peer-reviewed articles (cited more than 10,000 times), served on a number of editorial boards and reviewed project proposals for the EC, EPSRC, BBSRC, ESRC, NWO and others. His team produced the GATE open source platform for language and knowledge research, which is used by organisations as diverse as the BBC, WHO cancer research and the Financial Times, and which has attracted around €20 million of direct research funding at Sheffield alone.
Cunningham is currently researching open Internet of Things (IoT) devices for domestic aquaponics, has been a management committee member of the COST network EU Aquaponics Hub, and is the owner of a small greenhouse full of fish.
He teaches the COM3505 3rd-year undergraduate course on the IoT and will one day finish his next book, The Internet of Things: mi Casa su Botnet?
He believes that open source technology has a contribution to make to sustainability and resilience, and that political democracy is proving incapable of saving the planet due to our complete lack of economic democracy.
Find him at hamish.gate.ac.uk.
- Research interests
- aquaponics monitoring and control systems for sustainable intensive food production
- physical computing; micro-manufacturing; maker culture; Raspberry Pi
- privacy-preserving social media
- language analysis infrastructure, text mining and textual big data processing
- Publications
Books
- Text Processing with GATE (Version 6). GATE.
- Mi Casa su Botnet? Learning the Internet of Things with WaterElf, unPhone and the ESP32.
Journal articles
- The hidden potential of urban horticulture. Nature Food, 1, 155-159.
- Grow your own food security? Integrating science and citizen science to estimate the contribution of own growing to UK food production. Plants, People, Planet.
- Mímir: An open-source semantic search framework for interactive information seeking and discovery. Journal of Web Semantics, 30, 52-68.
- GATE Teamware: a web-based, collaborative text annotation framework. Language Resources and Evaluation, 47(4), 1007-1029.
- Getting More out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics. PLoS Computational Biology.
- Improving Habitability of Natural Language Interfaces for Querying Ontologies with Feedback and Clarification Dialogues. Journal of Web Semantics.
- Khresmoi - multilingual semantic search of medical text and images. Stud Health Technol Inform, 192, 1266.
- GATECloud.net: a Platform for Large-Scale, Open-Source Text Processing on the Cloud. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
- Agile Research. CoRR, abs/1202.0652.
- Using prior information from the medical literature in GWAS of oral cancer identifies novel susceptibility variant on Chromosome 4 - the AdAPT method. PLoS ONE, 7(5).
- Random indexing for finding similar nodes within large RDF Graphs. CEUR Workshop Proceedings, 737, 36-50.
- Towards Controlled Natural Language for Semantic Annotation. INT J SEMANT WEB INF, 6(4), 64-91.
- On designing controlled natural languages for semantic annotation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5972 LNAI, 187-205.
- Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6107 LNCS.
- Controlled natural language for semantic annotation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5554 LNCS, 816-820.
- Adapting SVM for data sparseness and imbalance: A case study in information extraction. Natural Language Engineering, 15(2), 241-271.
- Geometric and quantum methods for information retrieval. SIGIR Forum, 42, 22-32.
- Adopting ontologies for multisource identity resolution. Proceedings of the 1st International Workshop on Ontology-supported Business Intelligence, OBI 2008.
- Towards LarKC: A platform for Web-scale reasoning. Proceedings - IEEE International Conference on Semantic Computing 2008, ICSC 2008, 524-529.
- Semantic analysis for tomorrow's audio-visual digital archives. IET Seminar Digest, 2005(11099), 373-380.
- Knowledge management and human language: Crossing the chasm. Journal of Knowledge Management, 9(5), 108-131.
- Corpus Linguistics and South Asian Languages: Corpus Creation and Tool Development. Lit. Linguistic Comput., 19, 509-524.
- Software architecture for language engineering. Natural Language Engineering, 10(3-4), 205-209.
- Evolving GATE to meet new challenges in language engineering. Natural Language Engineering, 10(3-4), 349-373.
- GATE, a general architecture for text engineering. COMPUT HUMANITIES, 36(2), 223-254.
- Architectural Elements of Language Engineering Robustness. Journal of Natural Language Engineering, 8(2-3), 257-274.
- The role of taxonomy in language engineering - Discussion. PHILOS T ROY SOC A, 358(1769), 1354-1355.
- A definition and short history of Language Engineering. Nat. Lang. Eng., 5, 1-16.
- Information Extraction - A User Guide. CoRR, cmp-lg/9702006.
- New Methods, Current Trends and Software Infrastructure for NLP. Proceedings of NEMLAP-2.
- A General Architecture for Language Engineering (GATE) - a new approach to Language Engineering R&D.
Chapters
- Semantic search over documents and ontologies (pp. 31-53).
- Software reuse, object-oriented frameworks and NLP, New Methods In Language Processing (pp. 357-366).
- Semantic Annotations and Retrieval: Manual, Semiautomatic, and Automatic Generation, Handbook of Semantic Web Technologies (pp. 77-116). Springer Berlin Heidelberg
- Information Extraction and Semantic Annotation for Multi-Paradigm Information Management, Current Challenges in Patent Information Retrieval (pp. 307-327). Springer Berlin Heidelberg
- Indexing and querying linguistic metadata and document content, Recent Advances in Natural Language Processing IV (pp. 35-44). John Benjamins Publishing Company
- Information Extraction, Automatic, Encyclopedia of Language & Linguistics (pp. 665-677). Elsevier
- Semantic Annotation and Human Language Technology, Semantic Web Technologies (pp. 29-50). John Wiley & Sons, Ltd
Conference proceedings papers
- Towards Lightweight Authorisation of IoT-Oriented Smart-Farms using a Self-Healing Consensus Mechanism. 2022 31st Conference of Open Innovations Association (FRUCT), Vol. 00 (pp 265-276)
- Threat Modelling with the GDPR towards a Security and Privacy Metrics Framework for IoT Smart-farm Application. Proceedings of the 7th International Conference on Internet of Things, Big Data and Security (pp 91-102)
- Low-Energy Authentication with Selective Privacy for Heterogeneous IoT Devices in Smart-Farms. Conference of Open Innovation Association, FRUCT, Vol. 2021-October (pp 230-238)
- Selective privacy in IoT smart-farms for battery-powered device longevity. 20th International Conferences on WWW/Internet 2021 and Applied Computing 2021 (pp 137-146)
- Socio-political perspectives on surveillance and censorship: Implications for on-line privacy in the age of cloud computing. Computing Conference, 2017, 18 July 2017 - 20 July 2017.
- AnnoMarket – Multilingual text analytics at scale on the cloud. The Semantic Web: ESWC 2014 Satellite Events, Vol. 8798 (pp 315-319)
- AnnoMarket: An Open Cloud Platform for NLP. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp 19-24)
- Random Indexing for Finding Similar Nodes within Large RDF Graphs. ESWC Workshops, Vol. 7117 (pp 156-171)
- FREyA: An Interactive Way of Querying Linked Data Using Natural Language. ESWC Workshops, Vol. 7117 (pp 125-138)
- Identification of the question focus: Combining syntactic analysis and ontology-based lookup through the user interaction. Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010 (pp 361-368)
- Scaling Up High-Value Retrieval to Medium-Volume Data. ADVANCES IN MULTIDISCIPLINARY RETRIEVAL, Vol. 6107 (pp 1-5)
- Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-Based Lookup through the User Interaction. ESWC (1), Vol. 6088 (pp 106-120)
- Using Prior Information Attained from the Literature to Improve Ranking in Genome-wide Association Studies. GENETIC EPIDEMIOLOGY, Vol. 33(8) (pp 798-798)
- A framework for identity resolution and merging for multi-source information extraction. Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008 (pp 1367-1372)
- Large-scale, parallel automatic patent annotation. PaIR (pp 1-8)
- RoundTrip Ontology Authoring. SEMANTIC WEB - ISWC 2008, Vol. 5318 (pp 50-65)
- CLOnE: Controlled language for ontology editing. SEMANTIC WEB, PROCEEDINGS, Vol. 4825 (pp 142-155)
- Experiments of Opinion Analysis on the Corpora MPQA and NTCIR-6. NTCIR
- SVM Based Learning System for F-term Patent Classification. NTCIR
- Further use of Controlled Natural Language for Semantic Annotation of Wikis. Proceedings of the 1st Semantic Authoring and Annotation Workshop at ISWC2006. Athens, Georgia, USA
- User-friendly ontology authoring using a controlled language. Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006 (pp 35-40)
- Creating tools for morphological analysis of Sumerian. Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006 (pp 1762-1765)
- Automatic extraction of hierarchical relations from text. SEMANTIC WEB: RESEARCH AND APPLICATIONS, PROCEEDINGS, Vol. 4011 (pp 215-229)
- Mining information for instance unification. Semantic Web - ISWC 2006, Proceedings, Vol. 4273 (pp 329-342)
- Perceptron Learning for Chinese Word Segmentation. SIGHAN@IJCNLP 2005
- Using uneven margins SVM and Perceptron for information extraction. CoNLL 2005 - Proceedings of the Ninth Conference on Computational Natural Language Learning (pp 72-79)
- Indexing and querying linguistic metadata and document content. International Conference Recent Advances in Natural Language Processing, RANLP, Vol. 2005-January (pp 74-81)
- Web-assisted annotation, semantic indexing and search of television and radio news. WWW (pp 225-234)
- SVM based learning system for Information Extraction. DETERMINISTIC AND STATISTICAL METHODS IN MACHINE LEARNING, Vol. 3635 (pp 319-339)
- Extracting a domain ontology from linguistic resource based on relatedness measurements. 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings (pp 345-351)
- Multimedia indexing through multi-source and multi-language information extraction: the MUMIS project. DATA & KNOWLEDGE ENGINEERING, Vol. 48(2) (pp 247-264)
- Automatic language-independent induction of gazetteer lists. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 709-712)
- Large scale experiments for semantic labeling of noun phrases in raw text. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 811-814)
- Populating a database from parallel texts using ontology-based information extraction. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, Vol. 3136 (pp 254-264)
- A lightweight approach to coreference resolution for named entities in text. Anaphora Processing, Vol. 263 (pp 97-111)
- Using parallel texts to improve recall in botany. Recent Advances in Natural Language Processing III, Vol. 260 (pp 237-246)
- Automatic creation and monitoring of semantic metadata in a dynamic knowledge portal. ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, PROCEEDINGS, Vol. 3192 (pp 65-74)
- MUMIS – ADVANCED INFORMATION EXTRACTION FOR MULTIMEDIA INDEXING AND SEARCHING. Digital Media Processing for Multimedia Interactive Services
- Experiments with geographic knowledge for information extraction. Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references -, 31 May 2003.
- Event-coreference across multiple, multi-lingual sources in the Mumis project. Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03, 12 April 2003 - 17 April 2003.
- Intelligent Multimedia Indexing and Retrieval through Multi-source Information Extraction and Merging. IJCAI (pp 409-414)
- OLLIE. Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems - SEALTS '03, 31 May 2003 - 31 May 2003.
- GATE: A Unicode-based Infrastructure Supporting Multilingual Information Extraction. Proceedings of Workshop on Information Extraction for Slavonic and other Central and Eastern European Languages (IESL’03). Borovets, Bulgaria
- NE recognition without training data on a language you don’t speak. ACL Workshop on Multilingual and Mixed-language Named Entity Recognition: Combining Statistical and Symbolic Models. Sapporo, Japan
- Robust generic and query-based summarisation. Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03, 12 April 2003 - 17 April 2003.
- Multilingual adaptations of ANNIE, a reusable information extraction tool. Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03, 12 April 2003 - 17 April 2003.
- Rapid customization of an information extraction system for a surprise language. ACM Trans. Asian Lang. Inf. Process., Vol. 2 (pp 295-300)
- GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02). Philadelphia, USA
- Using a text engineering framework to build an extendable and portable IE-based summarisation system. Proceedings of the ACL-02 Workshop on Automatic Summarization -, 11 July 2002 - 12 July 2002.
- How feasible is the reuse of grammars for Named Entity Recognition?. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 1412-1418)
- EMILLE, A 67-million word corpus of Indic languages: Data collection, mark-up and harmonisation. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 819-825)
- Extracting information for automatic indexing of multimedia material. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 669-676)
- Using GATE as an environment for teaching NLP. Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics -, 7 July 2002 - 7 July 2002.
- A unicode-based environment for creation and use of language resources. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 66-71)
- Using Human Language Technology for Automatic Annotation and Indexing of Digital Library Content (pp 613-625)
- Human Language Technology for Automatic Annotation and Indexing of Digital Library Content (pp 658-658)
- A framework and graphical development environment for robust NLP tools and applications. ACL (pp 168-175)
- GATE: an architecture for development of robust HLT applications. 40TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE (pp 168-175)
- Adapting a robust multi-genre NE system for automatic content extraction. ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS AND APPLICATIONS, PROCEEDINGS, Vol. 2443 (pp 264-273)
- Access to multimedia information through multisource and multilanguage information extraction. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, Vol. 2553 (pp 160-171)
- Developing reusable and robust language processing components for information systems using GATE. 13TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS (pp 223-227)
- The automatic generation of formal annotations in a multimedia indexing and searching environment. Proceedings of the workshop on Human Language Technology and Knowledge Management -, 6 July 2001 - 7 July 2001.
- GATE. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02, 7 July 2002 - 12 July 2002.
- Using HLT for acquiring, retrieving and publishing knowledge in AKT. Proceedings of the workshop on Human Language Technology and Knowledge Management -, 6 July 2001 - 7 July 2001.
- Named Entity Recognition from Diverse Text Types. Recent Advances in Natural Language Processing 2001 Conference (pp 257-274). Tzigov Chark, Bulgaria
- Software infrastructure for language resources: A taxonomy of previous work and a requirements analysis. 2nd International Conference on Language Resources and Evaluation, LREC 2000
- Experience of using GATE for NLP R&D. Proceedings of the Workshop on Using Toolsets and Architectures To Build NLP Systems at COLING-2000. Luxembourg
- Implementing a sense tagger in a general architecture for text engineering. Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning - NeMLaP3/CoNLL '98, 11 January 1998 - 17 January 1998.
- Sense tagging and language engineering. ECAI 1998: 13TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS (pp 185-189)
- University of Sheffield: Description of the LaSIE-II system as used for MUC-7. 7th Message Understanding Conference, MUC 1998 - Proceedings
- Visual Execution and Data Visualization in Natural Language Processing. VL (pp 342-347)
- Software infrastructure for natural language processing. Proceedings of the fifth conference on Applied natural language processing -, 31 March 1997 - 3 April 1997.
- GATE. Proceedings of the fifth conference on Applied natural language processing Descriptions of system demonstrations and videos -, 31 March 1997 - 3 April 1997.
- Visual execution and data visualisation in natural language processing. 1997 IEEE SYMPOSIUM ON VISUAL LANGUAGES, PROCEEDINGS (pp 338-343)
- TIPSTER-compatible projects at Sheffield. Proceedings of a workshop on held at Vienna, Virginia May 6-8, 1996 -, 6 May 1996 - 8 May 1996.
- GATE. Proceedings of the 16th conference on Computational linguistics -, 5 August 1996 - 9 August 1996.
- GATE: An environment to support research and development in natural language engineering. EIGHTH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS (pp 58-66)
- University of Sheffield. Proceedings of the 6th conference on Message understanding - MUC6 '95, 6 November 1995 - 8 November 1995.
- Software Infrastructure for Natural Language Processing. 5th Conference on Applied Natural Language Processing, 1997
Reports
- Taxonomic Classification of IoT Smart Home Voice Control
- Challenges in Document Mining: Report from Dagstuhl Seminar 11171.
Software / Code
- GATE, a General Architecture for Text Engineering. Sheffield, UK: University of Sheffield. Retrieved from http://gate.ac.uk/
Preprints
- Taxonomic Classification of IoT Smart Home Voice Control.
- Selective privacy in IoT smart-farms for battery-powered device longevity, arXiv.
- Software Infrastructure for Natural Language Processing, arXiv.
- New Methods, Current Trends and Software Infrastructure for NLP, arXiv.
- A General Architecture for Language Engineering (GATE) - a new approach to Language Engineering R&D, arXiv.
- Mímir: An Open-Source Semantic Search Framework for Interactive Information Seeking and Discovery.
- Improving Habitability of Natural Language Interfaces for Querying Ontologies With Feedback and Clarification Dialogues.
- Grants
Current grants
- Resilient Campus, Resilient City (RC²), HEIF, 02/2019 to 07/2019, £54,000, as PI
- Project 2, MRC, 10/2018 to 02/2019, £22,744, as PI
- IRF II, Matrixware Information Services GMBH, 03/2008 to 12/2020, £110,687, as PI
- IRF IIII, Matrixware Information Services GMBH, 10/2009 to 12/2020, £133,857, as PI
- IRF V, Matrixware Information Services GMBH, 10/2009 to 12/2020, £36,506, as PI
- SoBigData Research Infrastructure, EC - H2020, 09/2015 to 08/2019, £649,690, as PI
Previous grants
- Low Resource Aquaponic Agriculture in Nepal, EPSRC, 10/2017 to 03/2018, £38,120, as PI
- MLi: Towards a MultiLingual Data Services infrastructure, EC - FP7, 11/2013 to 04/2016, £199,409, as PI
- ForgetIT: Concise Preservation by combining Managed Forgetting and Contextualized Remembering, EC - FP7, 02/2013 to 01/2016, £470,399, as PI
- AnnoMarket: Annotation Resource Marketplace in the Cloud, EC - FP7, 06/2012 to 05/2014, £394,226, as PI
- ARCOMEM: ARchive COmmunities MEMories, EC - FP7, 01/2011 to 12/2013, £764,781, as PI
- GATE Cloud Exploratory: Adapting the General Architecture for Text Engineering to Cloud Computing, EPSRC, 02/2011 to 10/2011, £71,677, as PI
- KHRESMOI: Knowledge Helper for Medical and Other Information users, EC - FP7, 09/2010 to 08/2014, £432,524, as PI
- IRF III, INFORMATION RETRIEVAL FACILITY, 03/2009 to 12/2014, £97,489, as PI
- LarKC: The Large Knowledge Collider: a platform for large scale integrated reasoning and Web-search, EC - FP7, 04/2008 to 09/2011, £432,262, as PI
- IRF, Information Retrieval Facility, 11/2007 to 10/2010, £40,280, as PI
- MediaCampaign: Discovering, EC - FP6, 04/2006 to 09/2008, £417,594, as PI
- NeOn: Lifecycle support for Networked Ontologies, EC - FP6, 03/2006 to 02/2010, £508,373, as PI
- IBM, IBM US, 11/2005 to 10/2008, £19,354, as PI
- LIRICS: Linguistic infrastructure for interoperable resources and systems, EC - FP6, 01/2005 to 06/2007, £230,976, as PI
- PrestoSpace: Preservation towards storage and access. Standardised Practices for Audiovisual Contents Archiving in Europe, EC - FP6, 02/2004 to 01/2008, £276,778, as PI
- Knowledge-Web, EC - FP6, 01/2004 to 12/2007, £136,674, as PI
- SEKT: Semantically-Enabled Knowledge Technologies, EC - FP6, 01/2004 to 12/2006, £713,345, as PI
- Generic tools for linguistic annotation and web-based analysis of literary Sumerian, AHRC, 10/2003 to 09/2006, £152,360, as PI
- h-TechSight: A Knowledge management platform with Intelligence & Insight capabilities for Technology Intensive Industries, EC - FP6, 04/2003 to 09/2004, £69,000, as PI
- ONTOWEB, EC - FP6, 12/2002 to 06/2004, £6,395, as PI
- MultiFlora II: a robust well-grounded biodiversity information system from multiple text sources, BBSRC, 05/2002 to 11/2003, £29,156, as PI
- Information extraction from company reports, Health and Safety Laboratory, 09/2001 to 09/2003, £35,000, as PI
- Professional activities and memberships
Research proposal reviewer for
- EPSRC (the Engineering and Physical Sciences Research Council, UK)
- the European Commission (FP6 IST, FP7 IST, ERC)
- BBSRC (the Biotechnology and Biological Sciences Research Council, UK)
- ESRC (the Economic and Social Research Council, UK)
- NWO (the Netherlands Organization for Scientific Research)
- NSERC (Natural Sciences and Engineering Research Council of Canada)
- IWT (Belgian Institute for the Promotion of Innovation by Science and Technology)
Schools outreach
- Accredited STEM Ambassador (DBS/CRB certified).
- Sheffield Cutlers’ Ambassador Scheme Raspberry Pi programme.
Journals
- Editorial Board member for the journal Language Resources and Evaluation.
- Area Chair for language processing and Editorial Board member for the Journal of Web Semantics (2005-2009).
- Editor of special issue of the Journal of Natural Language Engineering on Software Architecture for Language Engineering (2004).
- Reviewer for IBM Systems Journal.
- Reviewer for ACM Transactions on Information Systems (TOIS).
- Reviewer for the Special Issue of Lingvisticae Investigationes on Named Entities: Recognition, Classification and Use.
Advisory boards and professional associations
- Member of the Council of Professors and Heads of Computer Science (CPHC).
- Founding Scientific Board member of the Information Retrieval Facility for large-scale IR experimentation.
- Advisory group member of the International Internet Preservation Consortium.
- Technical committee of the European Cultural Heritage Online project.
Standardisation activities
- Founder member of OASIS/Open standardisation committee on Unstructured Information Management.
- Principal investigator for Sheffield on LIRICS project for ISO TC37/SC4 standards team.
- Member of the British Standards Institute committee TS/1 (Language Resources and Terminology).
- Participant in ISO TC37/SC4 workshop on annotation standards, Pont a Mousson, November 2002.
Conference and workshop organisation
- Co-chair of the Dagstuhl workshop Challenges in Document Mining, May 2011.
- General chair of the first Information Retrieval Facility conference (IRFC 2010), Vienna, May 2010.
- Co-organiser of the New Challenges for NLP Architectures workshop at LREC 2010, Malta, May 2010.
- Organising committee of the workshop on Crossing Media for Improved Information Access at LREC, Genoa, May 2006.
- Scientific committee of CASCON 2006, the 16th Annual International Conference of IBM Centers for Advanced Studies, Dublin, October 2006.
- Co-proposer of Summer Workshop on Language Engineering (chair: Louise Guthrie), CLSP at Johns Hopkins University, Baltimore, MD, USA, July 14 to August 22, 2003.
- Co-chair of workshop on Software Engineering and Architecture of Language Technology Systems (SEALTS) at HLT-NAACL 2003.
- Co-chair of workshop on Human Language Technology for the Semantic Web and Web Services at International Semantic Web Conference 2003.
- Programme chair of Workshop on Information Extraction for Slavonic and other Eastern and Central European Languages, RANLP 2003.
- Organising committee of the LREC-2000 workshop on Meta-Descriptions and Annotation.
- Organising committee of the LREC-2000 workshop on Schemas for Multimodal/Multimedia Language Resources and Data Architectures and Software Support for Large Corpora.
- Organising committee of the COLING-2000 Workshop on Using Toolsets and Architectures To Build NLP Systems, Centre Universitaire, Luxembourg, 5 August 2000.
- Co-chair of the EPSRC Workshop on NLP Architectures and Language Resources, Baslow, December 1998.
- Co-chair of the Distributing and Accessing Language Resources workshop, Granada LREC conference, May 1998.
Scholarships, invited lectures and tutorials
- ANR Chaire d’Excellence, Internet Memory, Paris, France, 2011-2012.
- Invited speaker, Text Analytics 2009, Boston, US.
- Visiting Professor, Université Joseph Fourier, Grenoble, 2009.
- Invited speaker, Discovery Knowledge and Informatics, Amsterdam, April 2007.
- Panellist on Applications of Memories for Life, British Library symposium, December 2006.
- Invited speaker at conference on European Digital Cultural Heritage, Salzburg, June 2006.
- Invited lecturer, EUROLAN 2005, Iasi, Romania.
- Visiting Scientist, DERI, National University of Ireland at Galway, 2004-2006.
- Tutorial on Human Language Technology for the Semantic Web at the European Semantic Web Symposium, Heraklion, Crete, May 2004.
- Invited speaker, ILASH workshop on Human Language Technologies for the Semantic Web: After OWL: Defacto Standards for Semantic Technology, Sheffield, March 2004.
- Visiting Scholar, Johns Hopkins University Center for Language and Speech Processing, Summer 2003.
- Invited lecture at IBM TJ Watson laboratory: Software Architecture for Language Engineering. August 2003.
- Invited tutorial on Named Entity Recognition at RANLP 2003.
- Invited lecturer on Information Extraction for the EuroLan Summer School, Iasi, Romania, 2001.
- Invited lecturer for HLT Center of Excellence in Information Society Technologies in 21 century (EC HLT project BIS-21) at the Linguistic Modelling Lab, Bulgarian Academy of Sciences.
- Invited presentation at British Classification Society workshop on Computer Text Analysis, Feb 2001, Dept. Probability and Statistics, Univ. Sheffield: GATE, A General Architecture for Text Engineering.
Programme committee memberships
- LREC 2014.
- 24th International Conference on Computational Linguistics (COLING 2012), Mumbai, India, December 2012.
- 16th International World Wide Web Conference (WWW2007), Banff, Canada, May 2007.
- 10th biannual international Congress of the Italian Association for Artificial Intelligence (AI*IA), 2007
- WWW 2006, the World-Wide Web conference, Edinburgh, UK, May 2006.
- EACL 2006, the 11th Meeting of the European Chapter of the Association for Computational Linguistics, April 3-7 2006, Italy.
- Senior Programme Committee of ISWC2006, the Fifth International Semantic Web Conference, Athens, USA, November 5-9, 2006.
- Web Content Mining with Human Language Technologies workshop at the International Semantic Web Conference, November 2006.
- The Semantic Desktop and Social Semantic Collaboration Workshop at the International Semantic Web Conference, 6 November 2006, Athens, GA, USA.
- OntoLex 2006, workshop at LREC 2006, Genoa, Italy.
- European Semantic Web Conference (ESWC), Budva, Montenegro, June 2006.
- Workshop on Multi-dimensional Markup in NLP at EACL 2006.
- Natural Language Processing for Metadata Extraction (NLP4ME 2006), AIMSA, Varna, Bulgaria, September 13-15, 2006.
- Workshop on Web Content Mining with Human Language Technologies, ISWC, Athens, GA, U.S.A. November 5-9 2006.
- IJCAI Edinburgh, UK, July/August 2005.
- Workshop on Multimedia and the Semantic Web, 2nd European Semantic Web Conference, Crete, May / June 2005.
- RANLP 2005 (Recent Advances in Natural Language Processing), Borovetz, Bulgaria, 2005.
- Workshop on Text Mining Research, Practice and Opportunities, RANLP 2005.
- Workshop on End-User Semantic Web Interaction, ISWC 2005, Galway, Ireland.
- IJCAI workshop on Natural Language Generation and the Semantic Web: Perspective and Challenges, Edinburgh, UK, July/August 2005.
- EUROLAN 2005, Multilingual Aligned Resources and their use in the context of Knowledge Web, July/August 2005, Cluj-Napoca, Romania.
- Second European Semantic Web Conference (ESWC), Heraklion, Crete, Greece, May 29 to 1 June, 2005.
- First International Workshop on Representation and Analysis of Web Space (RAWS-05), Prague-Tocna, September 15-16, 2005.
- Senior Programme Committee of ISWC2004, the Third International Semantic Web Conference, Hiroshima, Japan, November 7-11, 2004.
- ACL 2004: 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, July 21-26, 2004.
- Workshop on the Semantic Web at SIGIR 2004, Sheffield, UK, July 2004.
- IJCNLP-04 (International Joint Conference on Natural Language Processing), Hainan Island, China, March 22-24, 2004.
- ESWS (First European Semantic Web Symposium), Heraklion, Greece, May 10-12, 2004.
- Additional reviewer for COLING 2004.
- BIS 2004: 7th International Conference on Business Information Systems, Poznan, Poland, April 21-23 2004.
- ECAI 2004 workshop on Application of Semantic Web Technologies to Web Communities, Valencia, Spain, August 23rd, 2004.
- ECAI 2004 workshop on Ontology Learning and Population from Text, Valencia, Spain, August, 2004.
- RDF/RDFS and OWL in Language Technology: 4th Workshop on NLP and XML, ACL-2004, Barcelona, 2004.
- Workshop on NLP for Multimedia Applications, 16th ESSLLI, 16-20 August, Nancy 2004.
- Workshop on the Semantic Web at SIGIR 2003, Toronto, Canada, July 2003.
- RANLP 2003 (Recent Advances in Natural Language Processing), Borovetz, Bulgaria, 2003.
- EACL 2003 workshop on Evaluation Initiatives in Natural Language Processing.
- EACL 2003 workshop on Language Technology and the Semantic Web (the 3rd Workshop on NLP and XML).
- Second Workshop on NLP and XML (NLPXML-2002).
- RANLP 2001 (Recent Advances in Natural Language Processing), Tzigov Chark, Bulgaria, 2001.