Open Research Case Studies
Open research is the practice of making the processes and outputs of research transparent and freely accessible, whenever possible.
The case studies below, gathered from the winners and runners-up of the University of Sheffield’s Open Research Prize (first held in 2021 and most recently in 2023), demonstrate some of the excellent practice in open research taking place across the University.
Winners and runners-up of the Open Research Prize 2023
These case studies showcase the work of the winners and runners-up of the Open Research Prize 2023.
The winner of the individual staff category was Harry Wright of the Department of Chemistry, for the development of FoamPi, an affordable open-source hardware solution to measure polyurethane foam reaction kinetics.
The individual PGR student winner was Shuangke Jiang of the Department of Psychology, who pre-registered studies, openly shared data and code, and conducted replication work in the field of cognitive psychology.
A team prize was awarded to Alice Pyne, Neil Shephard, Max C Gamill, Sylvia Whittle and Mingxue Du for their work on TopoStats, an open source software tool in the Atomic Force Microscopy (AFM) field which adheres to the FAIR4RS (FAIR for Research Software) principles. In addition, five runner-up prizes were awarded.
The case studies are also available in PDF format in the University of Sheffield data repository, ORDA: https://doi.org/10.15131/shef.data.c.6749535
-
Harry Wright - Developing a free and open source low-cost alternative for measuring polyurethane foam reaction kinetics
-
Video case study
Written case study
My PhD focused on developing novel polyurethane foam-based growing media for high-tech hydroponic crop growth techniques. During synthesis of these foams, it is essential to ensure that reactions progress to complete conversion. This avoids unreacted reagents in the final foam, which could feasibly leach into the plants. The easiest way to measure polyurethane reaction progression is to measure the change in temperature, height and mass of the reacting foam and use a method called adiabatic temperature rise to determine the extent of reaction. This is usually done using specialised equipment that costs more than £20,000. With the support of my PhD supervisors, Professor Tony Ryan and Professor Duncan Cameron, I developed the FoamPi, an affordable and accessible solution for laboratories with limited resources that measures these parameters.
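As an illustration of the adiabatic temperature rise idea described above, the following Python sketch adds estimated heat losses back onto a measured temperature trace and converts the corrected rise into an extent of reaction. It is a simplified sketch under stated assumptions, not the FoamPi code: the function names, the lumped heat-loss coefficient and the maximum adiabatic rise are hypothetical and chosen only for illustration.

```python
def adiabatic_corrected(times, temps, ambient, heat_loss_coeff):
    """Estimate the temperature the foam would reach with no heat loss.

    times: sample times in seconds
    temps: measured foam temperatures in degrees C
    ambient: surrounding temperature in degrees C
    heat_loss_coeff: assumed lumped heat-loss coefficient in 1/s (hypothetical)
    """
    corrected = [temps[0]]
    lost = 0.0  # cumulative temperature lost to the surroundings
    for i in range(1, len(temps)):
        dt = times[i] - times[i - 1]
        # Trapezoidal estimate of the mean excess temperature over the step
        mean_excess = 0.5 * ((temps[i] - ambient) + (temps[i - 1] - ambient))
        lost += heat_loss_coeff * mean_excess * dt
        corrected.append(temps[i] + lost)
    return corrected


def extent_of_reaction(corrected, max_adiabatic_rise):
    """Convert a corrected temperature trace to conversion between 0 and 1.

    Assumes conversion is proportional to the corrected temperature rise.
    """
    start = corrected[0]
    return [
        min(max((t - start) / max_adiabatic_rise, 0.0), 1.0)
        for t in corrected
    ]


if __name__ == "__main__":
    times = [0, 10, 20]
    temps = [20.0, 60.0, 100.0]
    trace = adiabatic_corrected(times, temps, ambient=20.0, heat_loss_coeff=0.01)
    print(extent_of_reaction(trace, max_adiabatic_rise=100.0))
```

In practice the heat-loss coefficient would be fitted from the cooling part of the curve and the maximum rise taken from the formulation, so both inputs here should be read as placeholders.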
Designing an open, affordable solution
The FoamPi apparatus can be built for a cost of £350, with an even cheaper alternative-build option available for only £125. This cost-effective solution provides research labs with the opportunity to gain valuable insights into exothermic chemical reactions, particularly exothermic foaming reactions. As a researcher who has experienced first-hand the scarcity of funding and resources for material analysis in research labs in South Africa, I am proud to have developed an open, easily reproducible, low-cost hardware solution.
Using open source components and specifications
The FoamPi is built using open-source components such as a Raspberry Pi 4, Seeed Studio sensors, and a physical structure made from laser-cut plywood. The design and build details have been published in the journal HardwareX [1], a multidisciplinary, open-access journal for scientific hardware that aligns with the FAIR principles. The journal was selected as it promises to make scientific hardware accessible, discoverable, citable, comprehensible, reviewed and reproducible. The technical drawings and code have been stored on the Open Science Framework (OSF) [2] with a static DOI and registration on OSF [3], ensuring that all files are the same version as described in the paper. The code is also available on GitHub [4], allowing for future improvements and updates.
The FoamPi has already been successfully used in our lab, enabling precise, rapid, and reproducible monitoring of polyurethane foam reactions. We have also developed a simplified temperature-only logging thermocouple that has been used in other lab groups in the Department of Chemistry. In the future, I hope that more labs will adopt the FoamPi and work with us to improve its hardware and software using co-creation principles.
Other open source hardware solutions
My passion for open research extends beyond the FoamPi, and I have also developed other open-source hardware solutions. These include an airflow meter for measuring airflow through foam, which can determine the ratio of open and closed cells, an essential morphological characteristic. I am currently working to refine the design and develop the instrument according to the FAIR principles (funding-dependent). Additionally, I am the lead author of a manuscript describing a set of Python scripts that determine the colour of plant tissues from digital images, providing a cheap, easy, and non-destructive method of analysing plant tissue. This manuscript is currently undergoing peer review and is available as a preprint [5], with scripts available on OSF [6]. All research data is stored in the University of Sheffield’s ORDA repository [7], and the code is available on GitHub [8].
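To give a flavour of how colour can be read from a digital image, the Python sketch below selects pixels whose hue falls in a green range and averages their RGB values. It is an illustrative sketch only and does not reproduce the published scripts: the function name, the hue range and the saturation cut-off are assumptions, and the image is represented as a plain list of RGB tuples so that only the standard library is needed.

```python
import colorsys


def mean_plant_colour(pixels, hue_range=(0.17, 0.45), min_saturation=0.25):
    """Return the mean (R, G, B) of pixels that look like plant tissue.

    pixels: iterable of (r, g, b) tuples with values from 0 to 255
    hue_range: assumed hue window for green tissue, on a 0 to 1 scale
    min_saturation: assumed cut-off that rejects white or grey background
    """
    selected = []
    for r, g, b in pixels:
        hue, saturation, _value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if hue_range[0] <= hue <= hue_range[1] and saturation >= min_saturation:
            selected.append((r, g, b))
    if not selected:
        return None  # no plant-like pixels found
    count = len(selected)
    return tuple(sum(pixel[i] for pixel in selected) / count for i in range(3))


if __name__ == "__main__":
    image = [(0, 200, 0), (0, 100, 0), (255, 255, 255), (200, 0, 0)]
    print(mean_plant_colour(image))
```

Working in hue and saturation rather than raw RGB makes the selection less sensitive to brightness differences between photographs, which is one reason a threshold of this kind is a common starting point.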
In summary, my work on developing the FoamPi and other open-source solutions reflects my dedication to making research accessible, affordable, and reproducible with particular thought to enabling low-cost solutions for underfunded lab groups. I believe that these initiatives are in line with the principles of open research, FAIR and FAIR4RS, and I hope to continue contributing to this field in the future.
References
[1] Wright, H.C., Cameron, D.D. and Ryan, A.J. (2022). FoamPi: An open-source Raspberry Pi based apparatus for monitoring polyurethane foam reactions. HardwareX, 12, e00365. https://doi.org/10.1016/j.ohx.2022.e00365
[2] Wright, Harry (2022). FoamPi: An open-source Raspberry Pi based apparatus for monitoring polyurethane foam reactions. [Project]. Open Science Framework. https://doi.org/10.17605/OSF.IO/U3295
[3] Wright, Harry (2022). FoamPi: An open-source Raspberry Pi based apparatus for monitoring polyurethane foam reactions. [Registration]. OSF Registries. https://doi.org/10.17605/OSF.IO/Q6U9S
[4] Wright, Harry C. (2022). FoamPi. [Software]. GitHub. https://github.com/HarryCWright/FoamPi
[5] Wright, Harry Charles; Lawrence, Frederick Antonio; Ryan, Anthony John et al. (2023) ‘Free and open-source software for object detection, size, and colour determination for use in plant phenotyping’. Preprint (Version 1). Research Square. https://doi.org/10.21203/rs.3.rs-2546630/v1
[6] Wright, Harry (2023). PlantColourSizer. [Project]. Open Science Framework. https://doi.org/10.17605/OSF.IO/BNJ7G
[7] Wright, Harry (2023). Free and open-source plant phenotyping. [Dataset] The University of Sheffield, ORDA. https://doi.org/10.15131/shef.data.21989561.v1
[8] Wright, Harry C. (2023). PlantSizeClr. [Software]. GitHub. https://github.com/HarryCWright/PlantSizeClr
-
Shuangke Jiang - Preregistration, sharing materials, and conducting replication studies in Psychology
-
As a psychology PhD student, I engage actively in open research by pre-registering my empirical studies and promoting transparency of my data and code on the Open Science Framework. I have also contributed to the promotion of open practices by leading the development of our departmental FAIR checklists, as well as organising a Bayesian Statistics Workshop on campus in 2022.
Applying open research practices in Psychology
My practice in open research is exemplified by my two recent empirical PhD studies: a working memory training study, and a replication study on transcranial direct current stimulation (tDCS) effects that were reported by Wang et al (2019) [1]. This case study will detail my open research practice and the ways in which these actions, which include pre-registration, openly sharing data and code, posting preprints and transparently publishing findings, benefit my PhD outputs and future research.
Pre-registration for robust research
From the outset of both studies, we pre-registered the hypotheses and detailed plans for experimental design, sampling and analysis on the Open Science Framework (OSF), an openly searchable repository (the pre-registrations for the training and the tDCS replication studies can be found by following the links below [2], [3]). Such open research practices make it possible to conduct more careful and rigorous experiments, and increase the likelihood of our findings being reproducible. In particular, pre-registration was pivotal to our tDCS replication study. The findings reported in the original study are far-reaching, providing important theoretical and practical implications in the field of cognitive psychology and neuroscience. Replicating such an important study can verify the reliability of the originally-reported effects, and test their generalisability across conditions that inevitably differ from the original study. Indeed, by pre-registering the changes (for example, a bigger sample size, a larger number of trials, complete counterbalancing and a more difficult task level) to account for the original study’s possible limitations, we have enhanced the rigour and reproducibility of our replication attempts, and strengthened the null findings in our replication study.
Openly sharing (meta-)data and code following the FAIR principles
To facilitate and promote future replications of our findings or reuse of our materials, we made the materials FAIR - findable, accessible, interoperable and reusable. Take our training study as an example: we stored and archived the data and code on OSF to make them easy to find and access. Digital object identifiers (DOIs) and Creative Commons Attribution (CC BY) licences were assigned to the OSF project [4] and stored materials in support of the publication, as well as the preprint on PsyArXiv, a psychology-specific repository [5]. We also archived our experimental tasks and analysis code on GitHub to ensure good version control [6]. Finally, we made sure to include important metadata such as readme files.
The importance of sharing null findings
Open research practice has encouraged us to speak up for null findings. In addition to conducting frequentist significance testing, we pre-registered Bayesian hypothesis testing, which allows the strength of evidence, including for null findings, to be evaluated. Indeed, these open science approaches, in combination, increase the confidence of an early-career researcher like myself to transparently interpret and publish null results. This is particularly illustrated in our replication study, where we found substantial evidence against the significant tDCS effects that were reported by Wang et al (2019). Furthermore, transparently reporting the findings, even null effects, has helped me to promote this work for prospective publication. We presented the tDCS replication study at the European Society for Cognitive Psychology Conference in 2022, where we demonstrated our replication attempt and the strong evidence favouring null results. During the conference, this presentation led to an editor of a peer-reviewed journal inviting us to submit to their open science special collection.
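To show how a Bayesian test can quantify support for a null result, the Python sketch below computes an approximate Bayes factor in favour of the null hypothesis for a set of paired differences. It uses the BIC approximation, BF01 ≈ exp((BIC_alt − BIC_null) / 2), which is a stand-in chosen for simplicity and is not necessarily the analysis pre-registered for these studies; the function name and example data are hypothetical.

```python
import math


def bic_bayes_factor_null(differences):
    """Approximate Bayes factor BF01 for 'mean difference is zero'.

    Values above 1 favour the null model (mean fixed at zero); values
    below 1 favour the alternative model (mean estimated from the data).
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(differences) / n
    # Residual sums of squares under each model
    sse_null = sum(d * d for d in differences)
    sse_alt = sum((d - mean) ** 2 for d in differences)
    if sse_alt == 0:
        raise ValueError("differences have zero variance")
    # BIC up to a shared constant; the alternative has one extra parameter
    bic_null = n * math.log(sse_null / n)
    bic_alt = n * math.log(sse_alt / n) + math.log(n)
    return math.exp((bic_alt - bic_null) / 2)


if __name__ == "__main__":
    no_effect = [1.0, -1.0, 1.0, -1.0]
    clear_effect = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8]
    print(bic_bayes_factor_null(no_effect))
    print(bic_bayes_factor_null(clear_effect))
```

Unlike a non-significant p-value, a Bayes factor well above 1 is positive evidence for the null, which is what makes it useful when reporting replication failures.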
Looking to the future
As a PhD student, it was challenging to learn open science from scratch. Having seen my own and my colleagues’ successful endeavours in open research, I am proud to be one of the open science researchers in the community. My experiences further motivated me to engage in big open science collaborative projects like the #EEGManyLabs project to replicate Vogel & Machizawa (2004) [7], as well as an international, multi-site Open Research Area Call project on cognitive training across the lifespan. I am committed to taking these beneficial open research approaches wherever I go next, as well as continuing to spread best practice in open research.
References
[1] Wang, S., Itthipuripat, S., and Ku, Y. (2019). Electrical stimulation over human posterior parietal cortex selectively enhances the capacity of visual short-term memory. The journal of neuroscience: The official journal of the Society for Neuroscience, 39(3), pp. 528–536. https://doi.org/10.1523/JNEUROSCI.1959-18.2018
[2] Jiang, Shuangke; Jones, Myles; and von Bastian, Claudia C. (2020), ‘Mechanisms of visual working memory training: capacity and efficiency’. [Pre-registration]. Open Science Framework. https://doi.org/10.17605/OSF.IO/MK8FA
[3] Jiang, Shuangke; Jones, Myles; and von Bastian, Claudia C. (2020), ‘The tDCS effect on visual working memory over DLPFC and PPC’. [Pre-registration]. Open Science Framework. https://doi.org/10.17605/OSF.IO/N9FKP
[4] Jiang, Shuangke; Jones, Myles; and von Bastian, Claudia C. (2023), ‘Mechanisms of visual working memory training: capacity and efficiency’. [Project]. Open Science Framework. https://doi.org/10.17605/OSF.IO/K5HGE
[5] Jiang, S., Jones, M., & von Bastian, C. C. (2023). Mechanisms of cognitive change: Training improves the quality but not the quantity of visual working memory representations. Preprint. PsyArXiv. https://doi.org/10.31234/osf.io/xbkcz
[6] Jiang, Shuangke (2022), ‘Mechanisms of cognitive change: Training improves the quality but not the quantity of visual working memory representations’. [Software]. GitHub. https://github.com/J2K101000101/wmt-beh
[7] Vogel, E., and Machizawa, M. (2004) Neural activity predicts individual differences in visual working memory capacity. Nature 428, 748–751. https://doi.org/10.1038/nature02447
-
Alice Pyne and the TopoStats team (Joseph Beton, Thomas Catley, Xinyue Chen, Mingxue Du, Tobi Firth, Max Gamill, Libby Holmes, Robert Moorehead, Alice Pyne, Eddie Rollins, Neil Shephard, Bob Turner, Billie Ward, Sylvia Whittle, Laura Wiggins) - Developing open source software aligned with the FAIR4RS principles: The TopoStats project
-
Our team developed TopoStats [1], a Python toolkit with an LGPL licence. TopoStats enables researchers to automate editing, analysis and quantification of Atomic Force Microscopy (AFM) images to determine the structure of materials at the nanoscale. This unique functionality will aid the field in moving away from manual analysis processes which have low throughput and rely on experienced researchers. Our current focus is on developing new image analysis functionality for TopoStats to accelerate the development of novel therapies and improve our understanding of health and disease.
Open source software for collaboration and reuse
By making our software open-source, we have reached new collaborators across the world in academia and industry, who are using TopoStats to quantify materials ranging from next-generation sustainable materials such as solar cells to new AI-designed nanostructures for drug delivery. Beyond specific functionality, we see TopoStats as a tool to drive a transformation in research culture, placing greater emphasis on open, quantitative analysis of microscopy data, and introducing strategies for standardisation and metadata capture. We ensure our data is findable and accessible with persistent identifiers and metadata (e.g. [2]), and that metadata from TopoStats’ analysis is recorded via a configuration file which is automatically saved. This has increased the impact of our research, enabling others to use our datasets and making our work reproducible.
TopoStats and FAIR4RS
TopoStats’ development has been aligned with the FAIR principles, with a particular focus on the (re)usability of the software (FAIR4RS) [3], ensuring the code is easy to read, run and maintain. Our major progress has been in the following areas:
- Establishing co-working procedures on GitHub that ensure the code is easy to maintain and develop. This is essential as we have several people actively contributing to code development and, being an open source project, we also receive contributions from outside the group. These procedures include using project management tools to organise issues and prioritise them into milestones to facilitate clear goals for version releases.
- Adding a code of conduct and introducing templates for bug reports and feature requests.
- Simplifying the installation method to allow users to install using Python 3.0+ and with less complex requirements. This included removing outdated and difficult to install “dependencies” (old versions and software that TopoStats previously relied on), allowing us to modernise our codebase, and automate the release of new versions to the Python Package Index (PyPI) using open-source tools, lowering the barrier to usage whilst improving interoperability.
- Building an automated software test suite from the ground up. Automated tests ensure that changes to one part of the software do not break another, and that bugs do not re-occur. Such tests are a standard feature of trusted Python packages and are essential to ongoing software development.
- Automatically uploading versions released on PyPI to ORDA, so they are associated with a persistent identifier (e.g. [4]).
- Improving the “style” of our code by adhering to the PEP8 style guide which makes it more consistent. This makes it easier for a wide range of people to interpret and contribute to TopoStats.
- Reorganising and refactoring a large part of the code to make it easier to work with. This includes introducing “classes” to group some of the functionality together and making the code more modular so fixes, extensions and new features can be developed more easily and TopoStats functionality can be used by other packages (i.e. it is interoperable).
- Improving TopoStats’ documentation, with API documentation generated automatically from the code, supplemented by specific pages on installation and usage [5]. This makes the software easier to use and improve, keeps developer documentation close to the code and avoids duplication / unnecessary documentation.
- Adding easy-to-use configuration files where users can specify sample-specific parameters for analysis without having to modify the code. This reduces the barrier to entry.
- Running open source courses and specific workshops for people interested in developing their Python skills and/or using TopoStats.
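Two of the practices listed above, configuration files and automated tests, can be illustrated together. The Python sketch below is a hypothetical example rather than TopoStats code: the parameter names, the `count_features` function and the use of JSON (chosen so that only the standard library is needed) are all assumptions made for illustration.

```python
import json

# Hypothetical default parameters; users override them without editing code
DEFAULT_CONFIG = {"threshold": 1.0, "min_length": 2}


def load_config(text):
    """Merge user-supplied JSON settings over the defaults."""
    config = dict(DEFAULT_CONFIG)
    config.update(json.loads(text))
    return config


def count_features(heights, config):
    """Count runs of consecutive heights above the threshold.

    A run is counted only if it contains at least config["min_length"] points.
    """
    count = 0
    run = 0
    for height in heights:
        if height > config["threshold"]:
            run += 1
        else:
            if run >= config["min_length"]:
                count += 1
            run = 0
    if run >= config["min_length"]:
        count += 1  # a run that reaches the end of the profile
    return count


def test_count_features_respects_config():
    """Regression test: results change only when the configuration does."""
    heights = [0.0, 2.0, 2.5, 0.1, 3.0, 0.0]
    assert count_features(heights, load_config("{}")) == 1
    assert count_features(heights, load_config('{"min_length": 1}')) == 2


if __name__ == "__main__":
    test_count_features_respects_config()
    print("all tests passed")
```

A test runner such as pytest would discover and run the `test_` function automatically on every change, so a later refactor that altered the counting behaviour would be caught before release.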
In summary, the TopoStats team has developed an automated, usable open-source tool for the wider AFM community which aligns with FAIR4RS principles. We have streamlined the installation process, improved the usability of the code, and included automated tests. The most impactful outcome is that the global scientific community are now using this software, reporting bugs, requesting features, and opening the door for new collaborations. This represents a big step forward for the AFM community, improving quantitative processing and reliability of this microscopy data around the world through open software development.
References
[1] AFM-SPM (2023). TopoStats. [Software]. GitHub. https://github.com/AFM-SPM/TopoStats
[2] Pyne, Alice; Noy, Agnes; Main, Kavit; Velasco Berrelleza, Víctor; Piperakis, Michael; A. Mitchenall, Lesley; et al (2021). Atomic force microscopy and atomistic molecular dynamics simulation data to resolve structures of negatively-supercoiled DNA minicircles at base-pair resolution. [Dataset]. Figshare. https://doi.org/10.6084/m9.figshare.13116890
[3] Barker, M., Chue Hong, N.P., Katz, D.S. et al (2022). Introducing the FAIR Principles for research software. Scientific Data, 9, 622. https://doi.org/10.1038/s41597-022-01710-x
[4] Shephard, Neil; Whittle, Sylvia; Gamill, Max; Du, Mingxue; Pyne, Alice (2023). TopoStats - Atomic Force Microscopy image processing and analysis. [Software]. The University of Sheffield, ORDA. https://doi.org/10.15131/shef.data.22633528.v1
[5] TopoStats authors (2023). TopoStats documentation. [Web resource]. GitHub. https://afm-spm.github.io/TopoStats/main/index.html
-
Amber Copeland - The development of a novel decision-making task in addiction research
-
Video case study
Written case study
While completing my PhD in Psychology [1], I was reliant on the transparency and openness of other people’s methods to achieve my goal of creating a novel decision-making task that can capture the behavioural data required to fit computational models in addiction research. Early in the process, however, I experienced several challenges that were exacerbated by a lack of uptake of open research practices within the existing literature. For example, after systematically reviewing existing studies, we found that fewer than 2% of experimental studies on alcohol cognition were pre-registered [2], and that most studies that employ decision-making tasks were vague in the reporting of their methodology [3]. This meant that I spent a great deal of time contacting authors to ask for clarification or their materials, requests which, frustratingly, at times went unanswered.
Integrating open research practices in addiction research
As a result of my experience, I took extensive steps to integrate open research practices into my own research during the completion of my PhD, spanning the spectrum from planning and preparation right through to dissemination and publication. This is evidenced by the publicly available pre-registrations of all my study designs, methods, analysis plans, and hypotheses, completed prior to the collection of any data (for an example, see [4]). This reflects my belief that pre-registration does not entail any more work; rather, it front-loads the work in a way that saves time later once the data is collected. An additional benefit is that this step minimises researcher degrees of freedom and practices such as ‘HARKing’ and ‘p-hacking’ which increase the risk of type I errors (false positive conclusions) and thus threaten the replicability of research findings.
Regarding dissemination, I have summarised my research for general audiences (for example, in blog posts; see [5], which has been read more than 4,000 times) and, prior to submitting work for publication, I upload the findings to preprint servers that disseminate the research rapidly and openly [6]. I have published in several academic journals [3, 7, 8, 9], many of which are open access and accompanied by open science badges to promote the openness of the research, including the availability of all anonymised data and data preparation and analysis scripts on public repositories (see [10]). Furthermore, all my publications are accompanied by large supplementary files that include thorough details of methodology, so that readers do not need to contact me for further detail or clarification.
My adoption of open research practices has been explicitly acknowledged in two core instances. Firstly, code that I openly shared was used in another addiction research study that applied computational modelling techniques (see line 315 in [11] for direct acknowledgement of this). I find this particularly rewarding because when I started implementing these techniques in my PhD research, there was very little existing guidance, and therefore I have contributed to making these computational methods more accessible. Furthermore, upon checking the lead author's code, I was able to identify and subsequently help rectify an error, thus contributing to the integrity and reliability of published findings. Secondly, during the peer review process, the editor of a journal thanked me for sharing all data (raw and processed) and I was later awarded a ‘Research Spotlight’ award for exceptional contribution to the journal.
Reflections on implementing open research practices
Implementing open science practices during the completion of my PhD was not without its challenges. Learning to code in the R programming language was challenging to begin with and took considerable time. Furthermore, it can feel intimidating to share data and code so openly - especially raw data. However, this brings about substantial benefits because it permits scrutiny and replication of the research. Overall, my adoption of open science practices has been a learning curve and I am enjoying becoming a more transparent and rigorous researcher, as well as encouraging others to do so.
What next?
The next step is to create a permanent link and Digital Object Identifier (DOI) for the decision-making task that I created during my PhD (for example, via Pavlovia) so that it is citable and open to additional modification, as well as providing an exciting opportunity for collaborative research projects.
References
[1] Copeland, Amber (2022). An application of value-based decision-making (VBDM) to the study of addiction and recovery from it. PhD thesis. University of Sheffield. White Rose eTheses Online. https://etheses.whiterose.ac.uk/31874/
[2] Pennington, C.R. et al (2021). Raising the bar: improving methodological rigour in cognitive alcohol research. Addiction, 116(11), pp. 3243–3251. https://doi.org/10.1111/add.15563
[3] Copeland, A., Stafford, T. and Field, M. (2022). Methodological issues with value-based decision-making (VBDM) tasks: The effect of trial wording on evidence accumulation outputs from the EZ drift-diffusion model. Cogent psychology, 9(1). https://doi.org/10.1080/23311908.2022.2079801
[4] Copeland, Amber et al (2020). Decision-making in regular alcohol consumers and people who have cut-down. [Pre-registration]. AsPredicted. https://aspredicted.org/dh7vp.pdf
[5] Copeland, Amber (2020), Why Do People Who Feel That Their Life Is Meaningful Drink Less Alcohol? Blogpost. Psychreg. https://www.psychreg.org/meaningful-life-drinking-less-alcohol/
[6] Copeland, A., Stafford, T., & Field, M. (2022). Recovery from nicotine addiction: A diffusion model decomposition of value-based decision-making in current smokers and ex-smokers. Preprint. PsyArXiv. https://doi.org/10.31234/osf.io/3jrze
[7] Copeland, A., Stafford, T., Acuff, S. F., Murphy, J. G., & Field, M. (2023). Behavioral economic and value-based decision-making constructs that discriminate current heavy drinkers versus people who reduced their drinking without treatment. Psychology of Addictive Behaviors, 37(1), 132–143. https://doi.org/10.1037/adb0000873
[8] Copeland, A., Jones, A. and Field, M. (2020). The association between meaning in life and harmful drinking is mediated by individual differences in self-control and alcohol value. Addictive behaviors reports, 11, pp. 100258–100258. https://doi.org/10.1016/j.abrep.2020.100258
[9] Copeland, A. et al (2023). Meaning in life: investigating protective and risk factors for harmful alcohol consumption. Addiction research & theory, 31(3), pp. 191–200. https://doi.org/10.1080/16066359.2022.2134991
[10] Copeland, Amber; Stafford, Tom; and Field, Matt (2022). Methodological issues with VBDM tasks. [Data and code]. ResearchBox. https://researchbox.org/505
[11] Dora, Jonas (2022). R code to data analysis: Modeling the value-based decision to consume alcohol in response to emotional experiences. [Software]. Open Science Framework. https://osf.io/8khyg
-
Alicia Forsberg - Applying open research practices within the field of cognitive developmental psychology
-
Pre-registration and registered reports
Since my PhD, I have engaged with open research in a variety of ways. Firstly, most of my studies have been pre-registered (methods, hypotheses, and analyses), and research materials and data have been shared, allowing exact replications and additional analyses; see my Open Science Framework page. I was also fortunate to collaborate on an adversarial project, which compared three competing theories of Working Memory [1]. Research designs were agreed upon and three competing predictions were pre-registered, to prevent HARKing (Hypothesising After the Results are Known). Recently, I published a registered report [2], for which I submitted the introduction, methods, and analysis plan to the journal for peer review before data collection.
Community activities and teaching
As a postdoc at the University of Missouri, I started and co-led a local ReproducibiliTea journal club, involving early-career researchers in regular discussions about open research, and inviting Open Research speakers to visit our university. Based on these projects, I was invited to contribute to the British Psychological Society’s Developmental Newsletter [3]. Currently, I am setting up a network and an international online journal club for developmental researchers interested in open research, to share good practice. Interest has been excellent, with our first meeting taking place in the summer of 2023. Finally, I am designing a new Level 3 module, aiming to support students in critically evaluating psychological research; Open Research principles are central to this module.
Motivation
I am a strong believer in the power of open research to improve rigour and replicability, but for various reasons, open research principles have been slower to gain momentum in my research field of cognitive developmental psychology. Indeed, developmental research is notoriously messy. My research involves asking children as young as six years old to remember repetitive sequences of coloured shapes. Finding participants is difficult, and, as you might imagine, data quality can be compromised by various unpredictable factors. For example, sometimes a child might decide to sing loudly throughout the session, or a parent may decide to whisper the answers. I wanted to show other developmental scientists that despite these challenges, pre-registration and registered reports are not only possible but great tools to improve research rigour and quality in our field.
What I learnt from these experiences
When leading the registered report as a brand-new postdoc in a new research group, I learnt a lot about both the theoretical position and practical concerns of my collaborators. With this knowledge, we were able to strengthen our study design, and we are now collectively very proud of this project. Reviewer feedback was also incredibly useful – instead of spending time justifying our study design, we were able to simply incorporate their helpful suggestions prior to data collection. More broadly, I have learned a lot from both doing and talking about open research with various colleagues. Firstly, colleagues – both junior and senior – have shared their perceived barriers to engaging with open research. Some were worried that being tied to a specific plan might prevent publication if something unexpected occurred. Some were concerned that others might re-analyse their data and that this might impact their chances of getting a permanent job. When sharing my experiences with open research, I have addressed these concerns and highlighted how pre-registration still allows reasonable flexibility and how open research can be incredibly valuable for career progression.
Challenges & Impact
In my newsletter piece [3], I reflected on the specific concerns that developmental scientists may face when engaging with open research principles and shared some of the challenges that I have experienced myself. For example, in these projects, despite our best attempts, children still needed to be excluded for reasons we had not expected, and some inclusion criteria needed to be adjusted. By being honest about these challenges and explaining how we transparently proceeded with our project, I have supported a perception of these practices not as a stick to beat researchers with, or something which requires perfection, but instead, as a set of tools supporting rigour – especially helpful given the unpredictable nature of our research.
I feel the biggest impact of my open research activities has been in sharing my experiences to demonstrate that developmental data – with all its unexpected challenges – is not incompatible with open research. Several colleagues have asked for my advice as they have started to adopt open research practices, which highlights that my advocacy for inclusive and accepting open research is making an impact in my field.
References
[1] Cowan, N., Belletier, C., Doherty, J. M., Jaroslawska, A. J., Rhodes, S., Forsberg, A., et al (2020). How do scientific views change? Notes from an extended adversarial collaboration. Perspectives on Psychological Science, 15(4), 1011-1025. https://doi.org/10.1177/1745691620906415
[2] Forsberg, A., Adams, E. J., & Cowan, N. (2023). Why does visual working memory ability improve with age: More objects, more feature detail, or both? A registered report. Developmental science, 26(2), e13283. https://doi.org/10.1111/desc.13283
[3] Forsberg, A. (2022). Pre-registration in developmental psychology: Benefits and challenges. Developmental Psychology Forum. https://doi.org/10.53841/bpsdev.2022.1.95.4. [For access, see: White Rose Research Online; https://eprints.whiterose.ac.uk/200852/]
-
Ana Méndez de Andés - Using the Open Science Framework to share research data and contextual materials for a study on urban commons
-
Video case study
Written case study
This case study presents an online project which sought to make publicly available under a Creative Commons License the design, process and products of a qualitative research project that analyses the potential ‘becoming-common’ of the public, promoted by municipalist local governments in Spain. This work is part of a PhD project committed to producing knowledge as commons, and engaging with an emerging community of researcher-practitioners involved in creating actionable knowledge applicable to urban commons, planning and municipalist practices. The aim to make research more accessible, open and collaborative aligns with the project's framework as practice-based activist research, where knowledge production is considered a collective endeavour rather than the product of individual thinking.
Preparing to share data and outputs
The possibility of sharing the research outcomes and process was included in the project’s ethics application, which mentions the possibility to ‘record the permission related to the publications of the research data and outputs (metadata and reports), [which] will be published under a Creative Commons Attribution – ShareAlike 4.0 International licence [and] ask the participants if they agree that their data could be shared in a repository to be used in future research projects’. Furthermore, the project’s Data Management Plan states that ‘once each analysis phase is completed, the researcher will publish partial outcomes, researchers’ analysis and relevant raw data – only where written or oral consent has been obtained – in a publicly online accessed site.’
Selecting an appropriate repository
After researching available platforms, I decided to create a project in the Open Science Framework (OSF) platform [1]. OSF allows the creation of nested components, categorised as analysis, communication, data, method or project, among other categories; it provides storage for files; and it is possible to activate a wiki section, to encourage collaboration. Each component can incorporate different contributors, files have metadata information - related to title, document type and language - and tags are included to enhance findability. Another interesting feature for me was the ability to create DOIs for each component and assign a Creative Commons Attribution 4.0 International licence, unless another type is specified.
These features allowed me to create a) an archive of research documentation; b) a data, metadata and analysis archive; and c) a repository of partial outcomes and dissemination activities. Research makes use of public documentation and individual and collective contributions from social and institutional practitioners through interviews and workshops. This information is organised in three configurations. First: documents related to the project development, such as grant and ethics applications, data management plan, or project timeline. Second: data - public documents, interviews and workshop outputs - and analysis. Third: public seminars and publications in academic and non-academic environments. These different aspects allow me to present ‘data’ as part of a broader environment of academic institutional procedures – such as the ethics application or the risk assessment – and of social interactions. The aim of this ‘thick description’ [2] of the situations in which data has been envisioned, created, and analysed is to locate a wider research context that includes superseded and discarded lines of thought and shows the necessary adjustments to external factors, such as COVID-19.
The importance of contextualising data
This experience acknowledges the open qualitative research challenges presented by Class et al [3] regarding the complexity of sharing situated knowledge, thoughts and reflections that need to be contextualised. In this context, the OSF platform helps researchers to go beyond the question of replicability and offers the opportunity to contextualise qualitative data, while contributing to the different aspects of the Open Science ‘schools of thought’ [4]: namely, to provide access to research, make data and data analysis open and reachable, and foster collaboration and innovation. For wider public dissemination, however, the interface access, features and design are very much directed at academic research.
Impact and challenges
So far, the project has helped disseminate open research practices among PGR researchers who have had the opportunity to access the project’s documentation, incorporating transparency and accessibility to enable informal exchanges of practical knowledge. It also introduced to social activists the idea of reusable data and replicable analysis. This aspect is especially relevant to communities and experiences, such as the PAH (Platform for People Affected by Mortgages), in Barcelona, that attract great academic interest, and where members spend significant time attending to research requests. These social movements might benefit from the opportunity to make their contributions available so that they can refer to previous interviews, enabling subsequent researchers to build new questions based on what social actors have already said.
In developing this project, I have experienced some of the usual difficulties of qualitative research. First, most qualitative research outcomes are not computational, and are highly situational. This means that the question of data and analysis reuse and replicability has to take into consideration other parameters such as trust and relation, besides the more quantitative aspects of findability, accessibility and interoperability, which affects the selection of formats and information structure. Second, accessibility and legibility in qualitative research are labour intensive: editing and preparing interview transcriptions, or providing contextual information, for example, has proven to be quite time-consuming. As a general reflection, I would say that open data and FAIR principles offer a robust general framework, but open research practices need to be tailored when applied to the complexities of qualitative, situated, practice-based research.
References
[1] Mendez de Andes Aldama, Ana, Becoming-common of the public. Municipalist counter-planning strategies. [Project]. Open Science Framework. https://doi.org/10.17605/OSF.IO/NBP4R
[2] Gibson-Graham, J. K. (2014). Rethinking the economy with thick description and weak theory. Current Anthropology, 55(S9), pp. S147–S153. https://doi.org/10.1086/676646
[3] Class, B., de Bruyne, M., Wuillemin, C., Donzé, D. & Claivaz, J.-B. (2021). Towards open science for the qualitative researcher: From a positivist to an open interpretation. International Journal of Qualitative Methods, 20, p. 160940692110346. https://doi.org/10.1177/16094069211034641.
[4] Fecher, B. & Friesike, S. (2014). Open science: One term, five schools of thought, in Bartling, S. and Friesike, S. (eds.), Opening science: The evolving guide on how the internet is changing research, collaboration and scholarly publishing, pp. 17–47. Cham: Springer Nature.
- Sina Tabakhi - Developing a FAIR-compliant open-source feature selection tool: UniFeat
-
I engage actively with open research through my research and community activities. One of the notable contributions I have made recently is the development of the Universal Feature Selection Tool (UniFeat), which is an open-source tool entirely developed in Java for performing feature selection processes in various research areas. UniFeat provides a set of well-known and advanced feature selection methods within a unified framework. This allows researchers to compare the performance of different feature selection methods and select the best method for their research. Moreover, UniFeat is a benchmark tool because it includes methods in all approaches to feature selection, making it comprehensive and beneficial for the research community.
Sharing code for collaboration and reuse
To ensure that UniFeat adheres to FAIR4RS (findable, accessible, interoperable and reusable principles for research software), I released UniFeat on GitHub [1], which allows users to access the tool's source code, study use cases, contribute to the development of the tool, and report issues or bugs. UniFeat was then archived in Zenodo [2], ensuring its long-term preservation and accessibility, while promoting discoverability, open accessibility, and interoperability for reuse and collaboration. Moreover, I developed a well-documented tutorial for end-users [3]. I have also made the tool open to new contributors, providing them with full developer documentation for easy participation and extension [3]. UniFeat has been implemented entirely in Java, enabling it to run on various platforms. This implementation enables researchers to utilise UniFeat through its graphical user interface (GUI) or as a library in their Java code (additional details can be found on the UniFeat website [4]).
One of the motivations for developing UniFeat as an open-source tool was to facilitate the rapid development of new feature selection algorithms. The open-source nature of UniFeat enables researchers to use and modify it to meet their research requirements and share their methodologies with the scientific community. In addition, UniFeat includes essential auxiliary tools for performance evaluation, result visualisation, and statistical analysis. I published an overview of the UniFeat tool on the preprint server ArXiv [5] to make it more findable and accessible to potential users. I have also published an Original Software Publication (OSP) in Neurocomputing [6] to contribute to the scientific communication ecosystem by making UniFeat findable, indexable, archivable, searchable, citable, and referable.
Challenges in developing an open-source tool
Developing UniFeat as an open-source tool was not without challenges. One of the significant challenges was ensuring the clarity of the software design to enable easy modification and extension of the software. I overcame this challenge by thoroughly documenting the code and providing an API reference to help other developers to understand the code.
In conclusion, my engagement with open research through the development of UniFeat and other community activities has had a significant impact. UniFeat has provided a benchmark tool for feature selection methods, facilitating the rapid development of new feature selection algorithms. The open-source nature of UniFeat also encourages the sharing of methodologies with the scientific community. Through this experience, I have learned the importance of clear documentation, software design, and supporting community-driven initiatives.
References
[1] Tabakhi, Sina; and Moradi, Parham (2022). Universal Feature Selection Tool, 0.1.1. [Software]. GitHub. https://github.com/UniFeat/unifeat/releases/tag/v0.1.1.
[2] Tabakhi, Sina; and Moradi, Parham (2022). UniFeat: v0.1.1. [Software]. Zenodo. https://doi.org/10.5281/zenodo.8046034.
[3] Tabakhi, Sina; and Moradi, Parham (2022), Universal Feature Selection Tool User Manual. [User manual]. GitHub. https://unifeat.github.io/docs/user_manual_v1.1.pdf
[4] UniFeat (2023). Universal Feature Selection Tool. [Website]. https://unifeat.github.io
[5] Tabakhi, Sina; and Moradi, Parham (2022), Universal Feature Selection Tool (UniFeat): An open-source tool for dimensionality reduction. ArXiv. https://arxiv.org/abs/2211.16846
[6] Tabakhi, S. and Moradi, P. (2023) ‘Universal feature selection tool (UniFeat): An open-source tool for dimensionality reduction’. [Software]. Neurocomputing, 535, pp. 156–165. https://doi.org/10.1016/j.neucom.2023.03.037.
- Andy Tattersall - Conducting research into the prevalence of Wikipedia citation of open access sources
-
I have long championed Open Research within my department and continue to invest time with colleagues centrally to explore how we carry out and share research, which is especially important for a department like ScHARR. The research at ScHARR not only has a huge societal impact, it also informs national and international policy. Each and every one of us has a vested interest in health research, especially during the Covid and post-Covid periods. Making our outputs open and discoverable not only enhances the profile of the University, it also assists those who wish to access our research - not just academics or students, but also members of society.
Open access and Wikipedia-cited research
My specific projects around open research include a piece of work I published in 2022 relating to published research across the White Rose universities [1]. I noted through the use of Altmetric.com that a lot of Sheffield’s research was being cited on Wikipedia, which led me to the question ‘how many of these citations lead to an Open Access version?’ I believe this is incredibly important given how highly ranked and respected Wikipedia is globally and the fact that it is open to all, especially those in deprived and low-income regions.
My interest in Altmetrics (alternative indicators of scholarly interest) dates back to 2012 and in 2016 I authored and edited one of the first books on the subject [2]. After agreement from the publisher Facet, I deposited Open Access, author-version copies of my chapters into the White Rose Research Online repository to make these more accessible.
During my Wikipedia work, I sourced data from Altmetric which captured research published across the White Rose institutions in relation to how many had Wikipedia citations. I wanted to explore which disciplines across the three White Rose universities (Sheffield, Leeds and York) received the most attention on the open knowledge sharing platform. From there, I employed the Unpaywall Simple Query Tool to ascertain which of these citations were Open Access by using the DOIs from the research papers. I recruited colleagues working in scholarly communications with expertise in open access and licensing across the three institutions to assist with data analysis. We found that there were 6,454 citations of the White Rose universities’ research, published from 1922 to 2019, on Wikipedia. We also wanted to explore the origins of these citations and how many originated from or were connected to repositories. We found that 773 citations were linked to an Open Access repository. We published the paper in the fully open access UKSG-hosted journal Insights and uploaded the data to ORDA [3].
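The DOI-based check described above can be illustrated with a short sketch. This is not the workflow used in the study: it assumes the open access lookup has already been exported as a list of records, and the field names (`doi`, `is_oa`, `host_type`), the function name and the toy records are all hypothetical, chosen only to show the counting logic.

```python
def summarise_open_access(records):
    """Count unique cited DOIs, how many are open access,
    and how many of those are hosted in a repository.

    `records` is an assumed export format: one dict per citation
    with hypothetical keys 'doi', 'is_oa' and 'host_type'.
    """
    counts = {"total": 0, "open_access": 0, "repository": 0}
    seen = set()
    for record in records:
        # DOIs are case-insensitive, so normalise before de-duplicating.
        doi = record["doi"].strip().lower()
        if doi in seen:
            continue
        seen.add(doi)
        counts["total"] += 1
        if record.get("is_oa"):
            counts["open_access"] += 1
            if record.get("host_type") == "repository":
                counts["repository"] += 1
    return counts


# Invented toy records, for illustration only (not data from the study).
toy_records = [
    {"doi": "10.0000/example.1", "is_oa": True, "host_type": "repository"},
    {"doi": "10.0000/example.2", "is_oa": False, "host_type": None},
    {"doi": "10.0000/EXAMPLE.1", "is_oa": True, "host_type": "repository"},
    {"doi": "10.0000/example.3", "is_oa": True, "host_type": "publisher"},
]

if __name__ == "__main__":
    print(summarise_open_access(toy_records))
```

De-duplicating on the normalised DOI means a paper cited on several Wikipedia pages is counted once; whether to count papers or individual citations is a design choice that depends on the question being asked.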
Data sharing and Wikipedia edit-a-thons
The data was hosted before the paper was published, and I wrote an article on the research for the LSE Impact of Social Science blog [4]. This then led me to collaborate with James Pearson, a learning technologist from the Faculty of Arts and Humanities, to host two Wikipedia edit-a-thons. One of these workshops took place as part of the inaugural OpenFest in 2022. I also recently collaborated with colleagues from White Rose University Libraries on holding an online Wikidata event hosted by the University of York.
References
[1] Tattersall, A. et al (2022). Exploring open access coverage of Wikipedia-cited research across the White Rose universities. Insights: The UKSG Journal, 35, pp. 1–13. https://doi.org/10.1629/UKSG.559
[2] Tattersall, A. (2016) (editor). Altmetrics: a practical guide for librarians, researchers and academics. London: Facet Publishing.
[3] Tattersall, Andrew; O'Neill, Kate; Carroll, Christopher; Sheppard, Nick; Blake, Thom (2020). Exploring open access coverage of Wikipedia cited research across the White Rose Universities. Altmetric.com and Unpaywall datasets. [Dataset]. The University of Sheffield, ORDA. https://doi.org/10.15131/shef.data.12097797.v1
[4] Tattersall, Andy (2022). Wikipedia is open to all, the research underpinning it should be too. Blogpost. LSE Impact Blog. https://blogs.lse.ac.uk/impactofsocialsciences/2022/02/21/wikipedia-is-open-to-all-the-research-underpinning-it-should-be-too/
Winners and runners-up of the Open Research Prize 2021
These case studies are from the winners and runners up of the University’s inaugural Open Research Prize in Spring 2021. The winner was Tim Craggs of the Department of Chemistry, whose research group developed a fully open-source smFRET instrument with applications spanning disciplines across biophysics, biology and biomedicine.
The case studies are also available in PDF format in the University of Sheffield data repository, ORDA: https://doi.org/10.15131/shef.data.c.5621626
- Tim Craggs - Developing a new open-source instrument
-
The Craggs Lab in the Department of Chemistry have developed a fully open-source instrument for Single-molecule Förster Resonance Energy Transfer (smFRET) measurements, a powerful technique with applications spanning many disciplines across biophysics, biology and biomedicine.
Despite the many advantages of smFRET, which allows scientists to make measurements on a molecule by molecule basis, it is not widely used outside specialist labs. This is largely due to the high costs of commercial instruments and lack of self-build alternatives. To address this, we published a paper in Nature Communications [1] including detailed build instructions, parts lists and open-source acquisition software for a new instrument: the smfBox. This would enable a broad range of scientists to perform confocal smFRET experiments on a validated, self-built, robust and economic instrument. The paper has already received more than 6,000 downloads and has generated a large amount of media attention, with coverage by 13 news outlets and an Altmetric score of 220.
Applying open research in biophysics
Through our publication and the linked GitHub site, we provided everything needed to build and run the instrument, from hardware schematics to open-source software. We also provided open-source software for complete analysis of the data, in the form of a series of Jupyter notebooks. This allows other scientists to interact with and modify our datasets, which have a permanent DOI and include both data and analysis. This ‘open analysis’ approach allows complete transparency in the data we publish. Anyone can reproduce our analysis and figures, or alter analysis parameters to see the effects on the data, thereby establishing for themselves its robustness and any limitations.
As champions of open science and an elected member of the scientific advisory board to the international FRET community, we have encouraged the adoption of a standardised file format - HDF5 - for saving raw data. The HDF5 file architecture is FAIR-compliant, machine-readable and stores all required metadata alongside raw experimental data in a single file. The smfBox automatically saves metadata and raw data in the HDF5 file format, allowing users to analyse their data with a range of compatible software solutions, including our own open analysis Jupyter notebooks. We are now working with other researchers to help spread the use of this file format.
Open science is at its best when others can easily take advantage of the progress made, and we have promoted this through video demonstrations of our instrumentation. In the Journal of Visualized Experiments (JoVE), we offer a step-by-step video protocol [2] for using the smfBox to make accurate single-molecule FRET measurements. To increase access to this article, we have funded the full open access charge from our own consultancy funds (as UKRI pots are ineligible for this – a situation that ought to change).
Looking to the future
Open research is not about one thing; it is a way of life in the Craggs Lab, with the overall aim of making our science available, understandable and usable by the largest possible number of people. As a result of our open research, smfBox is currently being built by at least five other labs, in the US, Denmark and South Korea. We are also establishing a spinout company to produce a version of the instrument and software for sale and distribution to labs around the world. This company has received its first pre-seed investment funding, proving that open research can also lead to commercial opportunities.
Our ethos of open research includes many activities, from publishing all of our papers on relevant preprint servers [3] and the new approach of open-hardware instrumentation - in which we have been recognised as early leaders [4] - to establishing open data and analysis through encouraging standardised file formats for our field. Only through this multifaceted approach can open research realise its promise.
Our open research
- Build instructions and parts list for smfBox made openly available
- Open-source software provided in the form of Jupyter notebooks
- Step-by-step video protocol made available open access.
References
[1] Ambrose, B. et al. (2020). The smfBox is an open-source platform for single-molecule FRET. Nature Communications 11: 5641. https://doi.org/10.1038/s41467-020-19468-4
[2] Abdelhamid, M. et al. (2021). Making precise and accurate single-molecule FRET measurements using the open-source smfBox. JoVE 173: e62378. https://doi.org/10.3791/62378
[3] Craggs, T. et al. (2018). Substrate conformational dynamics drive structure-specific recognition of gapped DNA by DNA polymerase [Preprint]. bioRxiv. https://doi.org/10.1101/263038
[4] Fantner, G. and Oates, A. (2021). Instruments of change for academic tool development. Nature Physics 17: 421–424. https://doi.org/10.1038/s41567-021-01221-3
- Paul Schneider and Robert Smith - Creating new data with parkrun
-
Paul Schneider and Robert Smith are PhD students in the Wellcome Trust DTC for Public Health, Economics & Decision Science. They found their first-year research attachment with ‘parkrun UK’ so interesting that they continued to work on it even after they moved on to their PhD research. They tell us more about their work with the organisation that has encouraged thousands of people across the UK to take up the weekly challenge of a 5km run or walk.
An interesting opportunity for a research attachment arose when, in December 2018, parkrun received funding from Sport England to set up 200 additional events. The aim of this was to further increase participation, particularly from deprived communities. We established a collaboration with parkrun UK and helped them to better understand the current disparities in access to and participation in parkrun events in England. This involved the development of a geospatial optimisation algorithm which provided recommendations for the best parks and green spaces in which to establish new parkrun events.
Integrating openness into public health research
Throughout the project, we tried to make our research as transparent and accessible as possible. As this was initially planned as a short-term project adjacent to our PhDs, we wanted to ensure that other researchers could use the wealth of data provided by parkrun UK. We also wanted to enable researchers in the 22 countries where parkrun is currently active to reproduce and refine our methods.
Our research resulted in multiple research outputs, including an interactive map that shows existing parkrun events and recommended locations for future events. Since recommended locations were not always suitable to host running events, the map proved useful for parkrun UK in allowing them to identify alternative locations in close proximity. Our work also informed parkrun UK’s broader strategy for making their running events more inclusive, as illustrated by a 2020 press release:
Decisions about where to locate events have been informed by Rob’s expertise and insight, as have efforts to grow participation at those events once they have been established […] One example of how the statistical tool was used is the creation of Bowling Park parkrun, located in a deprived area of Bradford. Our local Ambassador, working with community groups, identified the location as an option for a parkrun event – which was corroborated by Rob’s work – and the event became a reality for the local people. [1]
Several open access publications resulted from the project [2,3,4], one of which is available in the Wellcome Open Research platform, which has an open peer-review process and staged version history. Preliminary results were also made available on preprint server medRxiv and promoted on social media to invite feedback. This led to an eagle-eyed reader spotting that their parkrun was missing from the map and informing us via the parkrun Facebook group. The bug was subsequently fixed, the interactive map updated and the paper corrected before submission to the journal. Rob also promoted our research to the wider public when he took part in Nicola Forwood and Danny Norman’s popular With Me Now parkrun podcast.
Looking to the future
To ensure our research follows the FAIR principles and is reusable by others in the future, we have made our research as accessible as possible. All of our data have been made openly available in Zenodo [5,6] and GitHub, alongside an annotated version of the source code used to generate the results, meaning that others can replicate our findings. The source code was also submitted to two Repo-Hacks - day-long hackathons where researchers from different fields meet and try to reproduce the published research of others. Our study was successfully replicated and we received some useful feedback, enabling us to make further improvements, and we have since built on the work in a subsequent open access publication [3] which provides data over a 10 year period. Rob has also given talks alongside representatives from the Wellcome Trust as part of their effort to encourage other researchers to make their research more open.
Our open research
- Preliminary results shared on preprint server, leading to a reader correction of data
- Publication of research output on the Wellcome Open Research platform
- Data and code openly available in GitHub and Zenodo.
References
[1] Using research to improve inclusivity. Parkrun UK [online]. 8 December 2020. https://blog.parkrun.com/uk/2020/12/08/using-research-to-improve-inclusivity
[2] Schneider, P. et al. (2020). Multiple deprivation and geographic distance to community physical activity events —achieving equitable access to parkrun in England. Public Health 189: 48-53. https://doi.org/10.1016/j.puhe.2020.09.002
[3] Smith, R. (2020). RobertASmith/DoPE_Public: Determinants of parkrun Engagement v1.0. [Dataset] Zenodo. https://doi.org/10.5281/zenodo.3596841
[4] Schneider, P. (2020). Code and Data Repository for: Multiple deprivation and geographic distance to community sport events —achieving equitable access to parkrun in England. [Dataset] Zenodo. https://doi.org/10.5281/zenodo.3866143
- Tom Webb - Climate change and marine ecology
-
Tom Webb of the Department of Animal and Plant Sciences has benefited from open research practices over the course of his career. In his project on the effect of increased environmental temperatures on marine species, Tom looks at how he is improving his own research practices.
As a macroecologist and biodiversity scientist, I am dependent on other people’s data in my quest to better understand how marine life is changing in the Anthropocene. I have experienced the frustration of key datasets being unavailable or untidy, of broken links and code lacking documentation. I also understand the need to incentivise data providers, and follow best practice in acknowledging their efforts through citation and collaboration. As a data user, I also feel responsible for the ongoing process of improving my own research practices.
Applying open research in marine ecology
My progress towards open research is exemplified by my recent project on the thermal limits of marine species. It’s a study built on open data, developed using open-source software, with results published in an open access journal. It also encapsulates my belief that we must make the best use of existing data in our efforts to address the biodiversity and climate crises: collating, linking, remixing and enriching openly available data to ask and - eventually - answer novel questions.
I started with the aim of quantifying the thermal tolerances of marine species. Sometimes these limits have been determined experimentally, but the logistics, expense and ethics of this approach mean only a few hundred marine species will ever be assessed in this way. However, we do have access to over 78 million occurrence records for more than 150,000 species through the Ocean Biodiversity Information System (OBIS), as well as large open datasets of sea temperature. We wanted to find out if these existing datasets could be linked together to summarise the temperatures in which different marine species have been recorded, so we matched large samples of this data with openly available sea temperature datasets. Our estimates of thermal tolerance were shown to match experimental results well, demonstrating that it is possible to use open data to obtain accurate assessments of thermal tolerances, a vital indicator for predicting changes in distribution under climate change.
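The linking step described above, matching occurrence records to sea temperature and summarising the temperatures at which each species has been recorded, can be sketched in a few lines. This is a simplified illustration rather than the published method (the project's own code was written in R and is available from the repositories cited below): the one-degree grid, the function names, and all species, coordinates and temperatures are invented for the example.

```python
import math
from statistics import mean


def grid_cell(lat, lon, size=1.0):
    """Return the index of the grid cell containing a coordinate.
    The regular one-degree grid is an assumption made for this sketch."""
    return (math.floor(lat / size), math.floor(lon / size))


def thermal_summary(occurrences, temperature_by_cell, size=1.0):
    """Summarise the temperatures at which each species was recorded.

    occurrences: iterable of (species, latitude, longitude) tuples.
    temperature_by_cell: dict mapping grid-cell index to temperature (deg C).
    Occurrences that fall in cells with no temperature value are skipped.
    """
    temps = {}
    for species, lat, lon in occurrences:
        cell = grid_cell(lat, lon, size)
        if cell in temperature_by_cell:
            temps.setdefault(species, []).append(temperature_by_cell[cell])
    return {
        species: {
            "n": len(values),
            "min": min(values),
            "mean": mean(values),
            "max": max(values),
        }
        for species, values in temps.items()
    }


# Invented toy inputs, for illustration only.
toy_temperatures = {(50, -5): 12.0, (55, 1): 9.0, (60, 3): 7.5}
toy_occurrences = [
    ("Species A", 50.2, -4.7),
    ("Species A", 55.9, 1.3),
    ("Species B", 60.1, 3.4),
    ("Species B", 10.0, 10.0),  # no temperature for this cell, so skipped
]

if __name__ == "__main__":
    for species, stats in thermal_summary(toy_occurrences, toy_temperatures).items():
        print(species, stats)
```

Summaries of this kind (minimum, mean and maximum recorded temperature per species) are the sort of occupancy-derived estimate that can then be compared against experimentally determined limits.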
As well as using openly available data, thereby reducing the need to expend resources on collecting new data, we wanted to make our own research as widely accessible as possible. A good illustration of this is the Data Availability Statement in our article published in Ecology and Evolution (2020):
'A major aim of this work is to make the tools required to replicate, adapt, and extend the methods presented freely available to the community. Our work uses existing publicly available data, and we show users how to access the same data from within the open source statistical environment R. Processed datasets and code for analysis and visualization are available via GitHub and are also deposited in Figshare [1] via the University of Sheffield's Online Research Data repository.’ [2]
This model of open data has become my preferred workflow, with data processed and analysed using open-source tools, archived in ORDA, and documented with an extensive readme serving as both documentation and tutorial. (There are more examples of our open data in GitHub and ORDA [3].)
Looking to the future
As part of our research, we created a data product [4] for the European Marine Observation and Data Network (EMODnet), showing how to derive, summarise and visualise thermal affinities for European marine species. Building on this, I am delighted to be leading the data products development team for Phase IV of EMODnet Biology (2021-23), a fantastic opportunity to work with scientists, data professionals, and research software engineers to make useful, accessible products to help shape international marine policy.
As someone of moderate technical ability, I have found the process of improving my research practice challenging, and it has taken me too long to adopt some aspects of good practice. But I am now embedding open research into the culture of my research group, and I am proud that the scientists I am training - including my early career co-authors [2] - are taking these principles with them wherever they go next.
Our open research
- Existing open biodiversity & climate data used extensively; newly generated data & code published
- Open data placed in repositories with documentation to maximise accessibility & usability
- Open research incorporated into research group training to help spread best practices
References
[1] Webb, T. (2020). Data and code for Occupancy-derived thermal affinities reflect known physiological thermal limits of marine species. [Dataset] University of Sheffield, Figshare. https://doi.org/10.15131/shef.data.12249686.v1
[2] Webb, T. et al. (2020). Occupancy-derived thermal affinities reflect known physiological thermal limits of marine species. Ecology and Evolution 10(14): 7050–7061. https://doi.org/10.1002/ece3.6407
[3] Webb, T. (2020). Linking dimensions of data on global marine animal diversity. [Software] University of Sheffield, Figshare. https://doi.org/10.15131/shef.data.12833891.v1
[4] Webb, T. and Lines, A. (2018). Thermal affinities for European marine species. [Dataset] Marine Data Archive. https://doi.org/10.14284/378
- Robert Shaw - Creating new open-source software in computational chemistry
Postdoctoral researcher Robert Shaw and colleagues in the Department of Chemistry have successfully engaged with open research through the development of the open-source software project libecpint.
Computational chemistry is increasingly used to guide and interpret experiments, as well as develop and test underlying theories of how chemistry happens. An important example is the modelling of systems containing heavier elements of the periodic table. These play vital roles in a range of areas, including improving sustainability of the chemical industry, producing new smart materials, and the nuclear fuel cycle. However, the ‘effective core potentials’ required for accurate and efficient calculations on these systems are typically only available in proprietary software.
Applying open research to computational chemistry
When we looked at creating a piece of open-source software in this area, we knew that it needed to be reusable and reproducible across various software packages. We therefore set about developing an open-source library, using sustainable software development practices, that would provide effective core potential functionality to other programs. Improving the approaches used in these calculations would also greatly reduce computational expense.
The novel algorithms we developed led to speed-ups of up to forty times over existing literature approaches, and we realised our implementation may also be beneficial in commercial computational chemistry packages. With this in mind, we released the code under the MIT open-source licence, allowing code reuse for open-source or proprietary projects. The next step in making the code accessible to the community was online hosting with version control on GitHub. In order to make the project sustainable, we wanted to ensure that others could contribute easily and meaningfully, so we added a number of helpful features. These included documentation for users and developers, a code of conduct and ‘architecture statement’ for contributors, and continuous integration to help find and correct errors before they caused problems for the project. Some of these valuable additions arose from engaging in the Journal of Open Source Software (JOSS) open peer review process.
The software library itself is an open research output and has been assigned a Zenodo DOI [1], making the software citable and helping to attract additional contributors. We have also produced several open access, peer-reviewed articles [2], including the article published in JOSS [3], which provides a statement of need for the software and its functionalities.
Another motivation for ensuring the openness of our research was alignment with the FAIR principles, and we aimed to make the algorithms required for calculations both accessible and reusable. We feel it is particularly important that a ‘reference implementation’ such as libecpint is open, free and meets community standards for sustainable software, as one of its primary purposes is to help in the creation of other software implementations. Our open approach has therefore enabled users to adapt and further develop the work themselves.
Looking to the future
Making this research and the resulting software open has led to its incorporation into at least four computational chemistry packages. These include the commercial package Entos, which is free for academic use, and the open-source packages QCSerenity, VOTCA and Psi4. Notably, the inclusion in the popular Psi4 package led to my contributing to, and becoming a named author of, Psi4 and its corresponding journal article [4].
It was surprising, but positive, to find there was a larger demand for the software than had been anticipated. However, this has been something of a double-edged sword; while other interested researchers have contributed code that improves the software, there have also been requests for additional features or changes that have led to extra work. There have also been difficulties in navigating various sustainable software technologies, such as code inspectors and continuous integration, with limited expertise, time and resources.
Funding or recognition for ensuring that scientific code is open and sustainable has been incredibly limited in the past, and it is very pleasing that the scientific community is making large strides to address the culture of irreproducibility. On a personal note, the positive aspects of the time and effort invested are an increase in important skills and the knowledge that the software will be usable, and improvable, by the community for years to come.
Our open research
- Open-source software accessible to contributors for further development
- Publication in open access journal with open peer review
- Software subsequently incorporated into other open-source packages
References
[1] Shaw, R. and Hill, J. (2021). Libecpint. [Software]. Zenodo. https://doi.org/10.5281/zenodo.4694353
[2] Shaw, R. and Hill, J. (2017). Prescreening and efficiency in the evaluation of integrals over ab initio effective core potentials. The Journal of Chemical Physics 147(7): 074108. https://doi.org/10.1063/1.4986887
[3] Shaw, R. and Hill, J. (2021). libecpint: A C++ library for the efficient evaluation of integrals over effective core potentials. Journal of Open Source Software 6(60): 3039. https://doi.org/10.21105/joss.03039
[4] Smith, D. et al. (2020). PSI4 1.4: Open-source software for high-throughput quantum chemistry. The Journal of Chemical Physics 152: 184108.
- Kirsty Liddiard, Dan Goodley and Katherine Runswick-Cole - Working with young disabled people
Living Life to the Fullest has given children and young people with ‘life-limiting’ or ‘life-threatening’ impairments opportunities to ‘speak about their lives in “new” ways: as joyful, creative, fun, challenging, but ultimately liveable, just like anyone else’. [1] Dr Kirsty Liddiard and colleagues from the School of Education explain how open research has been central to this ESRC-funded project.
One of the main aims of Living Life to the Fullest was to enable disabled children and young people to tell their own stories through engagement with the arts. To achieve this, we established a Co-Researcher Collective made up of young disabled people who co-led the inquiry with us. This meant that the project was grounded in academic and social openness, ensuring the research was accessible to everyone involved, from co-researchers, participants and their families to allies and community partner organisations.
Applying open research in disability studies
We undertook a number of public engagement and knowledge translation activities, including conferences and research festivals, both in person and online. Sharing our research outside of the project proved beneficial not only to young disabled people but also to organisations and employers. This was demonstrated by an event to which we invited Youth Employment UK (YEUK), a leading organisation working to change the youth employment landscape. Our co-researchers took the opportunity to share their insights on how employers can better support young disabled people, and as a result YEUK recognised the need to revise its resources, both for young people and for employers, to be more inclusive of young disabled people. We subsequently worked with YEUK to develop a webinar and written guide for employers, both of which were shared with over 700 organisations.
With the help of ESRC Festival of Social Sciences funding, we hosted several successful events in Sheffield, and we have had the opportunity to reach regional and national audiences through contributions to BBC television and radio. We were also delighted to commission a public art installation by Louise Atkinson and to support participants in submitting their artwork to the Rightfullives online art exhibition.
Other activities promoting open research included our work with Canine Partners, a registered charity that transforms the lives of disabled people through assistance dogs. Following further exploration of our early findings, we contributed to the Canine Care Project, which featured a report and a professionally animated film that are openly available and fully accessible to disabled young people and their families.
Looking to the future
In addition to sharing the results of our research, our open research culture motivated us to share our unique ways of working through our Co-production Toolkit. Why Can’t We Dream? is an online collection of freely available, downloadable resources co-designed and built with our research partners. The toolkit shows how a diverse team of academics, organisations and young disabled people can successfully carry out a co-produced, arts-informed research project, and provides a range of resources for researchers, charities, non-profit organisations and schools wishing to work in co-production with marginalised young people. Offering podcasts, films and lesson plans, the toolkit has already been adopted by a number of organisations around the UK, and we hope that our culture of open research will lead to more exciting projects and opportunities for young people in the future.
Our open research
- Research processes and findings shared via videos, blog and social media
- Openly available online toolkit created for organisations working with young disabled people
- Open and transparent collaboration with young disabled people and partner organisations
References
[1] Living Life to the Fullest. [Website] https://livinglifetothefullest.org/about/
[2] Liddiard, K. et al. (2019). Working the edges of Posthuman disability studies: theorising with disabled young people with life-limiting impairments. Sociology of Health and Illness 41(8): 1473-1487. https://doi.org/10.1111/1467-9566.12962
[3] Liddiard, K. et al. (2018). “I was excited by the idea of a project that focuses on those unasked questions”: Co-Producing disability research with disabled young people. Children and Society 33(2): 154-167. https://doi.org/10.1111/chso.12308