Dr Monica Lestari Paramita
BSc (Indonesia), MSc (Sheffield), PhD (Sheffield)
Information School
Lecturer in Data Science
Full contact details
Information School
Room C227
The Wave
2 Whitham Road
Sheffield
S10 2AH
Profile
I obtained my BSc in Computer Science from the University of Indonesia in 2006 and my MSc in Information Management from the University of Sheffield in 2008. Since 2008, I have worked as a researcher in diverse areas of Information Retrieval and Natural Language Processing.
My research roles have included investigating cross-lingual similarity on the Web, developing systems to support information access (e.g., incorporating voice-based input, visualising bias and transparency in search results), and analysing users' behaviour when interacting with such systems.
I obtained my PhD from the University of Sheffield in 2019, where I developed approaches for identifying cross-lingual similarity in Wikipedia articles.
I joined the Information School as a Lecturer in Data Science in September 2021.
University Responsibilities
- Deputy Programme Coordinator for BSc Data Science
- Researcher Development Lead
Research interests
My research focuses on bias and transparency in information retrieval and on multilingual information access. I am especially interested in investigating how bias-aware search engines should be designed to support users in their search tasks. I am also interested in researching cross-lingual similarity in Wikipedia; this includes creating methods to measure cross-lingual similarity, understanding why dissimilar information exists, and examining how this affects different users (e.g., users in different locations or those speaking different languages).
I would be interested in supervising PhD topics in the following areas:
- bias and transparency in search engines
- multilingual information access, such as multilingual search and cross-lingual similarity in Wikipedia
Publications
Featured publications
Journal articles
- Do you see what I see? Images of the COVID-19 pandemic through the lens of Google. Information Processing & Management, 58(5). View this article in WRRO
- Report on the CyCAT winter school on fairness, accountability, transparency and ethics (FATE) in AI. ACM SIGIR Forum, 55(1). View this article in WRRO
Conference proceedings papers
- Europeana: What Users Search For and Why. Research and Advanced Technology for Digital Libraries (pp 207-219). Thessaloniki, Greece, 18 September 2017. View this article in WRRO
- Using Section Headings to Compute Cross-Lingual Similarity of Wikipedia Articles. ECIR 2017: Advances in Information Retrieval (10193) (pp 663-669). Aberdeen, UK. View this article in WRRO
- A Comparison of Approaches for Measuring Cross-Lingual Similarity of Wikipedia Articles (pp 424-429)
- Do User Preferences and Evaluation Measures Line Up? Proceedings of the 33rd international ACM SIGIR conference on research and development in information retrieval (pp 555-562). Geneva, Switzerland, 19 July 2010. View this article in WRRO
All publications
Journal articles
- Towards improving user awareness of search engine biases: a participatory design approach. Journal of the Association for Information Science and Technology, 75(5), 581-599. View this article in WRRO
- Do you see what I see? Images of the COVID-19 pandemic through the lens of Google. Information Processing & Management, 58(5). View this article in WRRO
- Report on the CyCAT winter school on fairness, accountability, transparency and ethics (FATE) in AI. ACM SIGIR Forum, 55(1). View this article in WRRO
- Report on the CyCAT winter school on fairness, accountability, transparency and ethics (FATE) in AI. SIGIR Forum, 55, 4:1-4:1.
- Motivations, understandings, and experiences of open-access mega-journal authors: Results of a large-scale survey. Journal of the Association for Information Science and Technology, 70(7), 754-768. View this article in WRRO
- Extracting bilingual terms from the Web. Terminology, 21(2), 205-236. View this article in WRRO
- Bootstrapping term extractors for multiple languages. Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014, 483-489.
- Bilingual dictionaries for all EU languages. Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014, 2839-2845.
Chapters
- Named Entity Recommendations to Enhance Multilingual Retrieval in Europeana.eu, Lecture Notes in Computer Science (pp. 102-112). Springer International Publishing
- Introduction, Theory and Applications of Natural Language Processing (pp. 1-11). Springer International Publishing
- Cross-Language Comparability and Its Applications for MT, Theory and Applications of Natural Language Processing (pp. 13-53). Springer International Publishing
- Collecting Comparable Corpora, Theory and Applications of Natural Language Processing (pp. 55-87). Springer International Publishing
- Product Classification Using Microdata Annotations, Lecture Notes in Computer Science (pp. 716-732). Springer International Publishing
- Appendices, Theory and Applications of Natural Language Processing (pp. 291-323). Springer International Publishing
- Building and Using Comparable Corpora. In Sharoff S, Rapp R, Zweigenbaum P & Fung P (Eds.). Springer Berlin Heidelberg
- Methods for Collection and Evaluation of Comparable Documents, Building and Using Comparable Corpora (pp. 93-112). Springer Berlin Heidelberg
- Photographic Image Retrieval, The Information Retrieval Series (pp. 141-162). Springer Berlin Heidelberg
Conference proceedings papers
- Do origin and facts identify automatically generated text? Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2023), Vol. 3496. Andalusia, Spain, 26 September 2023. View this article in WRRO
- BASE: a Bias-Aware news Search Engine for improving user awareness [Prototype]. CEUR Workshop Proceedings, Vol. 3480 (pp 76-82)
- SNuC: The Sheffield Numbers Spoken Language Corpus. Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp 1978-1984). Marseille, France, 20 June 2022. View this article in WRRO
- Europeana: What Users Search For and Why. Research and Advanced Technology for Digital Libraries (pp 207-219). Thessaloniki, Greece, 18 September 2017. View this article in WRRO
- Using Section Headings to Compute Cross-Lingual Similarity of Wikipedia Articles. ECIR 2017: Advances in Information Retrieval (10193) (pp 663-669). Aberdeen, UK. View this article in WRRO
- The SENSEI Overview of Newspaper Readers’ Comments. Advances in Information Retrieval. ECIR 2017 (10193) (pp 758-761). Aberdeen, UK, 8 April 2017. View this article in WRRO
- The SENSEI Annotated Corpus: Human Summaries of Reader Comment Conversations in On-line News. Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp 42-52). Los Angeles, USA, 13 September 2016. View this article in WRRO
- What's the issue here?: Task-based evaluation of reader comment summarization systems. Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation (pp 3094-3101). Portorož, Slovenia, 23 May 2016. View this article in WRRO
- A Graph-Based Approach to Topic Clustering for Online Comments to News. Advances in Information Retrieval, Vol. 9626 (pp 15-29). Padua, Italy, 20 March 2016. View this article in WRRO
- Automatic label generation for news comment clusters. Proceedings of the 9th International Natural Language Generation conference (pp 61-69), 2016.
- A Comparison of Approaches for Measuring Cross-Lingual Similarity of Wikipedia Articles (pp 424-429)
- Assigning Terms to Domains by Document Classification. Proceedings of the 4th International Workshop on Computational Terminology (Computerm) (pp 11-21), August 2014.
- Extracting bilingual terminologies from comparable corpora. ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, Vol. 1 (pp 402-411)
- Correlation between Similarity Measures for Inter-Language Linked Wikipedia Articles. LREC 2012 (pp 790-797). Istanbul, Turkey, 23 May 2012. View this article in WRRO
- Collecting and using comparable corpora for statistical machine translation. Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012 (pp 438-445)
- CHiC 2011 - Cultural heritage in CLEF: From use cases to evaluation in practice for multilingual information access to cultural heritage. CEUR Workshop Proceedings, Vol. 1177
- Diversity in Photo Retrieval: Overview of the ImageCLEFPhoto Task 2009. Multilingual Information Access Evaluation II. Multimedia Experiments, Vol. 6242 (pp 45-59). Corfu, Greece, 30 September 2009. View this article in WRRO
- Do User Preferences and Evaluation Measures Line Up? Proceedings of the 33rd international ACM SIGIR conference on research and development in information retrieval (pp 555-562). Geneva, Switzerland, 19 July 2010. View this article in WRRO
- Generic and Spatial Approaches to Image Search Results Diversification (pp 603-610)
- Multiple approaches to analysing query diversity. Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp 734-735). Boston, Massachusetts, 19 July 2009. View this article in WRRO
- Identifying location in Indonesian documents for geographic information retrieval. Proceedings of the 4th ACM workshop on Geographical information retrieval, Vol. 68 (pp 19-24)
Research group
I am part of the Information Retrieval research group.
Teaching activities
INF113 - Data-Driven Organisations
INF214 - Using Data for Responsible Decision Making
INF6027 - Introduction to Data Science
INF6060 - Information Retrieval: Search Engines and Digital Libraries
Professional activities and memberships
- Associate Fellow of the Higher Education Academy
- Committee Member of the British Computer Society's Information Retrieval Specialist Group (BCS IRSG)
- Co-lead of the Shef.AI Interest Group: "Revolutionising data-driven research in the arts, humanities, and social sciences"