ETH ORD Measure 2 - Explore Proposal

FAIRqual - FAIR Data Practices for Qualitative Research in Transdisciplinarity

Authors
Affiliations

USYS Transdisciplinarity Lab ETH Zurich

USYS Transdisciplinarity Lab ETH Zurich

USYS Transdisciplinarity Lab ETH Zurich

Global Health Engineering, ETH Zurich

Global Health Engineering, ETH Zurich

This is a copy of the proposal as it was submitted to, and supported by the Open Research Data Program of the ETH Board.

Abstract

While ORD practices become increasingly widespread, one area that remains a challenge is qualitative data. On the one hand, qualitative data is more difficult to process and make available in open practices. On the other, ethical norms in research practice require confidentiality of research subjects. Yet interview transcripts, workshops or other kinds of qualitative data are difficult or sometimes impossible to anonymize. This challenge comes to the fore when conducting transdisciplinary (Td) research, where new forms of engagement between science and society co-produce problem framings and project outputs. In this context, questions of who processes and stores data are increasingly important. Td research makes frequent use of qualitative methods, especially interviews and workshops. Sharing of this data could allow for improved learning between Td processes and increase engagement between science and society. How might this data be shared according to FAIR principles? What are appropriate protocols for determining what data to share and how to navigate the ethical issues of research participant protection and the benefits of sharing qualitative data? In this project we will prototype tools for FAIR qualitative data within Td research projects, develop standards and guidelines for other Td researchers and build a dialogue and community of practice within the Td research community around FAIR practices for qualitative data.

Keywords

transdisciplinary research, qualitative methods, FAIR data practices, mixed media data, open science

Proposal full title

FAIRQual - FAIR Data Practices for Qualitative Research in Transdisciplinarity

Background and motivation

Section 1.1: Qualitative data and Mode 2/Td Research – A new frontier for open data

In the last decades, participatory qualitative and transdisciplinary (Td) research are on the rise (OECD, 2020). Td research allows for co-production of knowledge between societal and scientific actors in order to tackle sustainability challenges (OECD, 2020). This turn into more participatory and dialogical research settings has strong implications for qualitative methods, and therefore for the data produced. Data is no longer collected and/or analyzed solely by the researcher, but an active role from practitioners and other societal actors is expected. How to then deal with accessibility and management of this data and secure a fair and equitable access to it? In this way, transdisciplinary research implies a new frontier to qualitative methods and data as it poses new questions on who has the power to create and manage such data, who is entitled to access it and how (Pohl et al., 2021).

Qualitative research includes different methods, the most common of which include interviews, focus groups, participant observation, ethnographic research, and workshops. These methods are appropriate for sensitive topics or topics needing an in-depth exploration. They usually can collect a large amount of data from transcriptions or other recording methods as videos. Within qualitative research, standard practice, as outlined in ethical guidelines, is to assure confidentiality of the data acquired (for example, the ETH Zurich Research with Human Participants guidelines). Participants are asked to sign detailed consent forms which specify who has access to the provided answers and how the data is going to be used for research. Commonly, audio files are transcribed into text documents where the DOCX file format is common. The text documents are then analyzed using software such as NVivo, MaxQDA, or Atlas.ti. Anonymized quotes are used in the research paper, where potentially identifying details are removed. The audio files and transcribed text documents are typically stored on the researcher’s computer or a secure server. Ethics procedures require that this data be deleted after a certain number of years, often between 5 and 10 years. Recent granting requirements ask for a Data Management Plan. However, generally there is no appropriate long-term archiving allowing for open access to the data.

The importance of confidentiality in qualitative research is a fundamental pillar of social science research ethics. The promise of confidentiality creates an atmosphere of trust, allowing the researcher to obtain an authentic and deep understanding of the research topics. This practice can be appropriate for sensitive information and where an individual’s privacy needs to be protected. However, there are also cases for semi-structured interviews where anonymisation of participants is possible, and the interview content itself does not contain sensitive information. Or there may be cases where participants want to be named and have their perspectives openly shared. In these cases, the research data can be made available for other researchers to use. This is a common practice in quantitative research where data is made available in a data repository such as Zenodo. The research data is typically published in CSV format, which is a popular choice for quantitative research because it is a standardized open and machine-readable format.

Currently, many transdisciplinary researchers using qualitative methods face a quandary: Standard research ethics practices dictate that all human subjects’ data should be strictly confidential. Yet increasing demand from journals and even in cases from the project participants themselves, suggests that in many cases, the sharing of qualitative data could be valuable. To move forward the access to and the equitable sharing of qualitative research data within Td research, this proposal explores how this data can be shared in machine-readable formats. First, we will prototype ORD tools: we will either store or convert transcripts of semi-structured interviews in markdown file format with the aim to test and implement protocols to make qualitative data open and accessible. From there, we will extract text and transform it into a table format. The data will be shared using an established R data package workflow. Second, we will specify ORD standards by creating a set of guidelines for FAIR qualitative data in Td, based on our prototype and an analysis of practices and training needs of the Td community. Finally, we will engage our extensive network in building a community of practice around FAIR qualitative data in Td research, via networking and education events such as workshops and a webinar; creating and uploading guidelines on Td websites; and publishing an open-access journal article outlining guidelines and motivations for FAIR qualitative data in Td research.

Section 1.2: USYS TdLab

The Transdisciplinarity Lab (TdLab) at the Department of Environmental Systems Science (D-USYS, ETH) is a place where students, teachers, researchers, and other societal actors collaboratively define and tackle the complexities of sustainable development to respect environmental limits and uphold social justice for present and future generations. The lab conceptualizes and tests educational and research approaches to address real-world problems in the context of sustainable development through collaborative and reflective teaching and research. We give students freedom and responsibility to explore and experience interactive ways of doing methodologically sound research that involves active connection with stakeholders.

TdLab is well connected to national and international networks, including the Global Alliance for Inter- and Transdisciplinarity, Swiss Academic Society for Environmental Research and Ecology, the Network for Transdisciplinary Research, the tdAcademy and the newly formed Society for Transdisciplinary and Participatory Research. In addition, the members of TdLab, including senior staff, postdocs, research assistants and PhD students, each bring their own networks ranging across topics from living labs, regional development, sustainability, international cooperation, energy transitions, biodiversity, climate adaptation, circular economies, food systems and more. All members of TdLab have experience in qualitative or Td research–we will draw on this wealth of knowledge, experience and networks to support and advance our project.

The project co-applicants, Mollie Chapman and Bianca Vienni-Baptista are well-established researchers in their respective fields (conservation social science and cultural studies of science and technology, respectively). They each bring extensive experience in qualitative methods and transdisciplinary research, having led and implemented projects around the world (Argentina, Uruguay, Colombia, Costa Rica, Japan USA, Canada, Germany, Switzerland). Each is experienced in project management, interdisciplinary collaboration and supervision of early career researchers and staff. Christian Pohl is co-director of TdLab and brings decades of experience in workshop facilitation and Td research, as well as qualitative methods. We plan to hire a postdoctoral researcher to be housed at TdLab to contribute to the development of the tasks detailed in this proposal. Potential candidates from our extensive networks have already been contacted, whose experiences span from qualitative methods and Td research and are well connected to both communities.

Section 1.3: Data stewardship support

At ETH Zurich, we have identified the Data Stewardship Network as a helpful resource to support us in developing strategies for sharing of qualitative research data. We have engaged with data steward, Lars Schöbitz, who will support the technical elements of sharing qualitative research data, following commonly used FAIR principles for data sharing. The data steward is employed within the Global Health Engineering group at ETH Zurich, which has previously worked on a first prototype of sharing qualitative research data (Schöbitz et al., 2023).

Section 1.4: RDM Guidelines

The Guidelines for Research Data Management at ETH Zurich (RDM Guidelines) require all research data at ETH to be shared as Open Research Data (ORD) (ETH, 2022). There are existing guidelines provided by the RDM team of the ETH Library, but these provide general guidelines rather than a specific practical application of how to share qualitative research data. The Qualitative Data Repository (QDR), hosted by the Center for Qualitative and Multi-Method Inquiry at Syracuse University, USA provides services for storing and sharing data generated through qualitative and multi-method research. The repository adheres to FAIR principles for data sharing, as it provides persistent Digital Object Identifiers (DOIs), defines licenses, and follows metadata standards such as DataCite and Dublin Core to facilitate data discovery and interoperability. Finally, it offers an active data curation service to increase potential for reuse. It is the best example for existing ORD best practices in the social sciences. Their existing services and guidelines will help us contribute to build up missing guidelines within the ETH domain. Through the Data Stewardship Network, we can integrate our learnings with ORD services at ETH and support the RDM efforts of the ETH Library.

Problem statement

Despite the importance of confidentiality of data in some research contexts, we are curious about the necessary ubiquity of the principle for all qualitative research. Td research involves interfaces between science and society, where researchers and societal actors work together on challenges that neither could address alone (Gibbons, 1999). While Td research involves many different methods and scientific disciplines, qualitative research often plays a central role–either to facilitate the Td process or as part of the scientific study itself. Such processes can be resource-intensive: organizing workshops and focus groups that bring together diverse societal actors, conducting in-depth interviews of one or more hours each with 20 to 40 study participants, and developing visions and problem framings of societal challenges. Large amounts of data are collected and much is learned in the process. In some cases the Td process is also studied and the results published (see for instance, Vienni-Baptista et al. (2022))—but in others only the “scientific” studies themselves are published. Important learning and insights and the opportunity to compare across cases and over time, are lost. Beyond this, there may be cases where Td projects themselves would benefit from greater sharing of data, e.g., to build support for a new initiative or to increase the possibility of participation from diverse stakeholders.

ORD project plan

The scholarship on transdisciplinary research, when entailing co-production of knowledge with stakeholders and other scientific partners, has not yet touched upon this topic. Therefore the background studies on how qualitative research is dealt with in transdisciplinary settings is scarce (an example of open science recommendations for qualitative methods is Alexander et al. (2019), however this does not address Td research contexts). However, our transdisciplinary communities face this challenge in every single project. Ways to overcome this challenge have included the development of new methods to systematize experiences (such as Robledo et al., 2023) and participatory qualitative research and the further elaborations such as engaged anthropology (see for instance, Larsen et al. (2022)). For example, with the method of photo-voice, research participants take photos of their daily lives and explain to researchers why these are important to them; often these projects involve a photo exhibition of the final works. In this method, the sharing of the results and ‘data’ including individual identities is an important part of the participatory and empowering goals of this method. Humanities and social sciences, and more specifically, qualitative researchers have begun attending to the problem of integrating the FAIR principles into co-productive settings, work we will build upon. Nevertheless, we see an urgent demand and need to further elaborate on the access to qualitative data. In the spirit of Open Science, we aim at sharing what’s shareable so that others can reuse and learn from it. An unintentional benefit of applying Open Science principles to our work will be unforeseen reuse cases that can be applied by researchers outside of the Td community and qualitative research field. We therefore seek to understand the conditions under which qualitative data collected as part of Td processes can or should be shared following FAIR principles (Wilkinson et al., 2016); to develop best practices for doing so; and to build a community of practice for the sharing of data within the Td field.

WP1: Identification of ORD tools and workflows for publication of FAIR qualitative data in Td research

Goal

Our first step will be to elaborate the technical requirements of sharing qualitative data and consider what if any changes would be required to standard methods for qualitative data collection.

Activities

  1. Activity 1.1: Literature review on existing practices of FAIR and open data for Td and qualitative research
  2. Activity 1.2: Identification of tools for the collection of and transcription of data
  3. Activity 1.3: Development of workflows for transformation of text files in different formats to machine-readable table format

Research Questions

  1. RQ 1.1: What procedures are currently used by researchers and practitioners to collect qualitative data in transdisciplinary settings? How do they assure FAIR principles?
  2. RQ 1.2: What are best practice workflows for data management in order to share qualitative data?
  3. RQ 1.3: What kinds of possibilities exist for collecting and sharing data from transdisciplinary methods using qualitative techniques, which might include workshops using flip chart drawings, post-it notes, group conversations, marked up maps, and other mixed media?

Aims addressed

In this step we will develop an initial prototype of ORD standards for qualitative research data and for workshop data based on the systematization of previous experiences and review of the relevant literature.

WP2: Evaluation of current data practices and identification of training and education needs

Goal

In this work package we will use two approaches to elicit ideas about sharing qualitative data generated during Td research projects. As a first step, we will organize a workshop at the International Transdisciplinarity Conference (ITD) organized by the Transdisciplinary Research Network (td-net, Swiss National Academies of Science), to be held in November 2024. This conference brings together scientists and practitioners from diverse fields who design and implement Td methods, tools and theories. Holding a workshop in this setting will allow us to gather insights from this network, as well as to present the idea of open Td data to this research community. Complementary contributions in other conferences are also planned for WP4, to present findings to the larger interdisciplinary and transdisciplinary communities. The Network for Transdisciplinary Research (td-net) organizes an annual Transdisciplinary Day, which also constitutes an ideal setting to further discuss with and raise awareness of this project.

As a second step, drawing on insights and networks from the workshop and other relevant panels during the ITD conference, as well as TdLab’s existing networks, we will conduct interviews with Td scientists as well as practitioners and participants from previous Td projects. We will ask researchers and practitioners questions such as: Which situations have you experienced or can you imagine where sharing of qualitative data would be beneficial for the project? Under which circumstances should data remain confidential? What are your experiences of anonymizing interview data? Would you consider reusing qualitative data from another Td study? What metadata, access conditions, or other resources would you need for this data to be useful? Who should determine when and how qualitative data is shared? Are there ways that stakeholders or interview respondents might participate in how their data is shared (e.g., reviewing and elaborating transcripts, or selecting sections to publish and which to redact)? What are your experiences, concerns or ideas for sharing mulit-media data, such as that from workshops, or visual methods? What kinds of technical support and training materials would you need to publish your qualitative data according to FAIR standards?

Activities

  1. Activity 2.1: Ethics application explaining the unique character of our study.
  2. Activity 2.2: Workshop at the ITD Conference 2024 to identify and systematize diverse perspectives on the topic
  3. Activity 2.3: Identification and recruitment of interview respondents. Conduct interviews (in-person or via zoom) with 10 to 20 respondents
  4. Activity 2.4: Transcription and conversion of interview audio files
  5. Activity 2.5: Qualitative coding and analysis of interview data
  6. Activity 2.6: Transformation of interview data to ORD using workflows defined in WP1

Research Questions

  1. RQ 2.1: Under what conditions does sharing of qualitative data generated during Td research provide benefits for a) research participants, b) the Td project itself, or c) the Td community?
  2. RQ 2.2: What are the risks of sharing Td data openly to the groups listed above (RQ 2.1)? How can these be mitigated?
  3. RQ 2.3: If qualitative data is shared, what are potential formats for sharing this data to make it not only available to researchers but also to societal actors?
  4. RQ 2.4: What kinds of concerns or benefits are specific to particular kinds of data? In other words, how is interview data different from workshop data? How might multia-media data be shared?

Aims addressed

This WP focuses on exchange with other researchers and stakeholders to understand the risks, costs, and benefits of sharing qualitative data generated as part of Td processes. As well, the process of conducting the workshop and interviews will help to establish a community of practice around open qualitative data within Td research projects. The interviews will serve as data to prototype the ORD protocols we develop, a case study for the example website and training materials, and also inform us as to the needs of the community.

WP3: Development of guidelines and protocols for FAIR qualitative data in Td research

Goal

Based on the results of WP1 and WP2, our team will develop guidelines for when and how qualitative data should be made open in Td research projects (including ethical, methodological and technical guidelines). We will then share, refine and improve these guidelines via a 1 day in-person workshop at ETH. We will invite 10-20 researchers, stakeholders, data scientists and other relevant experts to collaborate on the development of the guidelines. The workshop will also serve as a networking and educational event, to foster dialogue on FAIRqual and to inspire other teams to implement open science. These guidelines, which we will share and discuss broadly in WP4, will serve as an invitation to greater participation in open science among the Td community and as a means to set best practices for future research projects.

Activities

  1. Activity 3.1: Development of conditions for when sharing of qualitative data from Td processes is valuable, or when it should be avoided
  2. Activity 3.2: Development of technical protocols for how to share specific types of data (interview transcripts, workshop findings, multimedia)
  3. Activity 3.3: Create example data sharing website using the data generated in our project.
  4. Activity 3.4: 1 day in-person workshop at ETH to develop fully usable guidelines

Research Questions

  1. RQ 3.1: What are best practices in qualitative data sharing in Td research?
  2. RQ 3.2: What are the technical protocols needed for sharing different kinds of qualitative data according to FAIR standards?
  3. RQ 3.3: What are the needs of researchers to make these practices actionable in Td settings?

Aims addressed

This work package defines a critical step in developing standards for ORD. We will lay out guidelines for how to navigate the challenge of assuring protections for research subjects while also creating opportunities for sharing of data. By doing this, we will not only enrich the current discussions on Td research but also in ethical requirements to be pursued at ETH Domain.

WP4: Building a community of practice for FAIR qualitative data in Td research

Goal

In this WP we will engage in several activities to broadly share our findings with the Td community.

Activities

  1. Activity 4.1: Interactive webinar to share our results and dialogue with Td community
  2. Activity 4.2: Sharing of website as a demo project, with data collected from our project displayed in engaging formats
  3. Activity 4.3: Publication of scientific article describing our project and results in a widely read Td or qualitative journal
  4. Activity 4.4: Develop protocols suitable to be shared online in on-going Td toolkitsonline (such as ShapeID, TdNet toolbox).
  5. Activity 4.5: Presentation of FAIRQual guidelines and protocols at Td and qualitative conferences by members of the TdLab

Aims addressed

This WP focuses on building communities of practice to engage in ORD practices. We will employ our own and the TdLab`s extensive national and international networks, as well as additional networks generated during WP3 and combine these with ORD networks and qualitative research networks to broadly share our findings and engage these communities in conversations about this topic. We specifically target Td researchers using qualitative methods, but will as a secondary community, also reach out to various qualitative research networks, for example, TdLab is an institutional member in the Gesellschaft für transdisziplinäre und partizipative Forschung (Society for Transdisciplinary and Participatory Research), which includes both Td researchers as well as qualitative researchers using participatory methods from such areas as public health.

Project Implementation and Risk Mitigation

Building a FAIR qualitative data community of practice for Td is a high risk/high gain endeavor. We plan to anticipate and mitigate these risks as follows. Given the long tradition of confidentiality in qualitative research, we anticipate that many researchers will be reluctant to engage in ORD practices and may even oppose these on ethical grounds. We will mitigate this risk by actively bringing these concerns into our guidelines and by identifying benefits of ORD practices for researchers, research participants and stakeholders. As the Td community is based on values of open communication, collaboration and dialogue, we expect that a meaningful path forward can be found. A further risk is that the ETH ethics board may find our proposed project challenging. We anticipate opening a dialogue and potentially even collaborating with the ETH ethics board to initiate the process of developing new protocols for FAIRqual. Given the dual ethical needs to protect research subjects and for FAIR and open data, we see a productive tension in which to initiate new avenues and approaches.

Impact

How sustainable is the proposed project inside the ETH Domain?

As the project will be housed in TdLab, which is a leader on transdisciplinary methods within the ETH domain and internationally, the key insights will be shared across the whole lab group (around 15 scientists and staff) and also widely communicated to the ETH domain. TdLab also serves as a central node in teaching and research collaboration within D-USYS and ETH domain more broadly; researchers from many fields come to TdLab for advice, collaboration and support in transdisciplinary methods and tools into their research. In this way, the insights generated in this project, and the specific protocols and methodologies will radiate out from the project partners. TdLab also maintains a strong institutional memory, such that the project’s guidelines and protocols will be sustained for future Td projects and scientists. Both co-applicants of this proposal are Group Leaders within the TdLab. The outcomes from this proposal will also benefit the research projects from early career researchers working in transdisciplinary contexts. As lecturers in several courses at D-USYS and internationally, the co-applicants will also use the protocols and guidelines for teaching and learning purposes.

To what extent will the planned project substantially advance the ORD field, or solve a critical outstanding problem in the targeted field(s)?

In our target field of qualitative research within Td projects, the challenge of the dual demands of open science and participant confidentiality is a critical outstanding problem. Journals and funders increasingly demand open science, data management plans, and more accessible research. Yet standard practices for qualitative research seem to contradict this need. How can Td researches navigate this tension? By developing guidelines on responsible, ethical ways to share qualitative data in FAIR formats, we will meet a critical need of the Td community. Such an approach has potential for widespread uptake within the Td community. Td research itself is based on ideas of science-society interaction, leaving the ivory tower, and creating socially meaningful science. These goals align with those of the ORD movement, creating a strong synergy. As well, the field of Td research is dynamic and open to change, and TdLab is considered a leader in Td methods, such that we see potential for widespread adoption of the guidelines and protocols we develop.

To what extent does the proposal support a collaborative approach, involving significant synergies, complementarities, or/and scientific added-value to achieve its objectives?

Our project involves collaboration between data scientists and qualitative researchers–a novel combination. It is deeply collaborative; neither field would have the expertise to conduct this project on its own. Expertise in qualitative research and Td methods, as well as in data science and stewardship is needed to create a set of guidelines and protocols that are truly usable. Alone, qualitative researchers could suggest when and what kinds of data to share, but would not be able to elaborate how to technically do this. Alone, data stewards could suggest technical protocols to share data, but not how to navigate the tricky ethical challenges. By combining these expertises, we will create a “whole product” that guides future Td qualitative research from ethics to file types.

Work Packages and milestones

The following Google Sheet shows a basic gantt chart against the four work packages, including program activities. Column “Lead” abbreviations:

LS: Lars Schöbitz CP: Christian Pohl

TdLab: Mollie Chapman and Bianca Vienni-Baptista, postdoctoral researcher (to be hired), with support of TdLab staff and students

Google Sheet: https://docs.google.com/spreadsheets/d/14kl5jumfwk3jwxwrnJEz9Jvnui9vl-gOpq2utZIN7X0/edit#gid=0

Bibliography

Alexander, S. M., Jones, K., Bennett, N. J., Budden, A., Cox, M., Crosas, M. x000E8, Game, E. T., Geary, J., Hardy, R. D., Johnson, J. T., Karcher, S., Motzer, N., Pittman, J., Randell, H., Silva, J. A., da Silva, P. P., Strasser, C., Strawhacker, C., Stuhl, A., & Weber, N. (2019). Qualitative data sharing and synthesis for sustainability science. Nature Sustainability, 1–8. https://doi.org/10.1038/s41893-019-0434-8
ETH, Z. (2022). Guidelines for Research Data Management at ETH Zurich (RDM Guidelines).
Gibbons, M. (1999). Science’s new social contract with society. Nature, 402(6761), C81–C84. https://doi.org/10.1038/35011576
Larsen, P., Bacalzo, D., Naef, P., Tibet, E. E., Baracchini, L., & Riva, S. (2022). Repositioning Engaged Anthropology: Critical Reflexivity and Overcoming Dichotomies. Swiss Journal of Sociocultural Anthropology, 27, 4–15. https://doi.org/10.36950/tsantsa.2022.27.7994
OECD. (2020). Addressing societal challenges using transdisciplinary research. OECD. https://doi.org/10.1787/0ca0ca45-en
Pohl, C., Klein, J. T., Hoffmann, S., Mitchell, C., & Fam, D. (2021). Conceptualising transdisciplinary integration as a multidimensional interactive process. Environmental Science & Policy, 118, 18–26. https://doi.org/10.1016/j.envsci.2020.12.005
Schöbitz, L., Loos, S. C., Kalina, M., Ogwang, J. O., Kwangulero, J., & Tilley, E. (2023). Biogasoutcomesmalawi: Data for 61 semi-structured interviews with biogas owners in Malawi (Version 0.0.1). https://doi.org/10.5281/zenodo.6470427
Vienni-Baptista, B., Goñi Mazzitelli, M., García Bravo, M. H., Rivas Fauré, I., Marín-Vanegas, D. F., & Hidalgo, C. (2022). Situated expertise in integration and implementation processes in Latin America. Humanities and Social Sciences Communications, 9(1), 1–14. https://doi.org/10.1057/s41599-022-01203-7
Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 160018. https://doi.org/10.1038/sdata.2016.18