Tensions and learnings in research-program partnerships undertaking RCTs

17 March 2020

By Cari Jo Clark, Sudhindra Sharma, and Kathryn M. Yount

The recent Sveriges Riksbank Prize in Economic Sciences, awarded to Abhijit Banerjee, Esther Duflo, and Michael Kremer “for their experimental approach to alleviating global poverty,” and ongoing collaboration with CARE colleagues on the Tipping Point randomized controlled trial (RCT) offer an opportunity to reflect on lessons learned in research-program partnerships involving RCTs. We offer our reflections on the possibilities and tensions of RCT designs for evaluating programs that aim to prevent critical social problems primarily affecting girls and women, such as child, early, and forced marriage and other forms of gender-based violence (GBV). Discussions about RCTs are underway in various fields, including in a special series in the journal World Development. The field of GBV prevention has not yet had the same level of public debate, so we share our contribution here.

The reasons to choose an RCT design in violence-prevention research are many. RCTs are considered a gold standard for assessing program impacts. The rigor and requirements for pre-trial planning and transparency regarding outcomes, analysis, and ethical oversight minimize biases that can otherwise creep into the design. At the same time, this rigor can constrain desirable changes to program design, implementation, and management that are identified during the trial. For organizations that undertake RCTs, these constraints may be felt throughout the implementation process.

Aligned or misaligned timeframes

One of the first tensions often arises over differences in the timing of program and research rollout. Programming often is more efficient and cost-effective when it can proceed without the added coordination demands that come with an RCT. Pre-programming activities can occur while formal permissions for program rollout are being sought, whereas preparatory research activities such as training and data collection can occur only after the study design is finalized and local and international ethical approvals have been granted. When these timelines are not aligned well in advance, the program organization may face costly delays and ad hoc modifications, which are never resource neutral and can adversely affect the quality of the research and the programming.

Demands of pre-trial data collection

RCTs often are funded and deployed with little of the information needed for strong designs and risk mitigation.[1] “Feasibility” in RCT terms means a well-documented list of sites that are equally eligible, feasible for program implementation, politically and ethically randomizable to “treatment” or “control” conditions, and well characterized with data for pre- or post-trial matching. Feasibility also entails knowledge of, and preparation for, the investments of time and other resources needed to recruit the required number of participants and to sustain their participation in the face of expected non-participation and dropout over study waves. Recruiting and retaining control participants is tricky, especially if engagement with them is minimal while the program intervention is underway in treatment areas. Unfortunately, the formative work and data needed to make informed decisions about sampling, recruitment, and retention often are lacking. Pre-trial data collection often is not budgeted or timed well, despite its necessity where routine data systems (census data, maps, etc.) are lacking.
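
To make the recruitment arithmetic concrete, here is a minimal sketch, in Python, of one standard way to inflate a target sample size for the design effect of cluster randomization and for expected dropout. The formulas, numbers, and function name are illustrative assumptions, not figures from the Tipping Point trial.

```python
# A minimal sketch, assuming a cluster-randomized design: inflate a target
# sample size for the design effect of clustering and for expected dropout.
# All numbers and variable names are hypothetical illustrations.
import math

def inflate_sample_size(n_effective, attrition_rate, cluster_size, icc):
    """Return the number of participants to recruit at baseline.

    n_effective    -- sample size needed under individual randomization
    attrition_rate -- expected share of participants lost by endline
    cluster_size   -- average number of participants per cluster
    icc            -- intra-cluster correlation of the primary outcome
    """
    design_effect = 1 + (cluster_size - 1) * icc      # variance inflation from clustering
    n_clustered = n_effective * design_effect         # adjust for cluster randomization
    n_recruit = n_clustered / (1 - attrition_rate)    # adjust for expected dropout
    return math.ceil(n_recruit)

# Hypothetical example: 800 participants needed under individual randomization,
# 20% expected attrition, 25 participants per cluster, ICC of 0.05.
print(inflate_sample_size(800, 0.20, 25, 0.05))  # prints 2200
```

Run in reverse, the same arithmetic shows how quickly unplanned dropout erodes the effective sample, which is why pre-trial estimates of attrition matter.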

Generating the sample: an intricate dance between researchers and programmers

Sampling in RCTs is challenging. Though a sample design may be planned at the proposal stage, the actual sampling of eligible participants before baseline is difficult for many reasons. For community-based (non-facility-based) trials, a census of the study area must be completed first. Then, the research and program partners must reach a common understanding of eligibility and draw the sample; these steps generally occur after the census and before training for the baseline survey. Misalignment of research and program timeframes can place intense time constraints on the local research partner. Constant interaction between program and research partners also is needed when sample selection occurs. The list of specific individuals to be sampled should be shared with the research partner before baseline survey training begins. A delay in sharing this list could adversely affect the timing of the baseline vis-à-vis program implementation, with implications for assessing the causal impact of the program.

Navigating the demands of RCTs and local gatekeepers

The identification and selection of “control” areas in the field, particularly in lower-income settings, while critical to the RCT design, is difficult for program partners. Local governments and communities often expect the program to cover an entire administrative area. The choice to randomize clusters within an administrative area to “non-treatment” often contradicts the ethics of program delivery and strains organizational relationships with local governments and communities. Further, the need for control conditions places pressure on local program partners to juggle the demands of an RCT and those of the local governments and communities whose agreement is needed before implementation can get underway. While this challenge is real and significantly affects the feasibility of a randomized design, limiting widespread exposure to programming that may be ineffective, or may have negative unintended consequences, is not unethical; rolling such programming out widely may itself be a waste of precious resources.[2]
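
For readers unfamiliar with how within-area randomization is commonly operationalized, the sketch below assigns clusters to treatment or control separately within each administrative area, so that every area contains both arms. The area names, cluster IDs, and fifty-fifty allocation are hypothetical and do not represent the Tipping Point design.

```python
# A minimal sketch, assuming a fifty-fifty allocation: randomize clusters to
# treatment or control separately within each administrative area, so every
# area contains both arms. Area names and cluster IDs are hypothetical.
import random

clusters_by_area = {
    "Area A": ["A-01", "A-02", "A-03", "A-04"],
    "Area B": ["B-01", "B-02", "B-03", "B-04", "B-05", "B-06"],
}

def randomize_within_areas(clusters_by_area, seed=2020):
    """Assign half of the clusters in each area to treatment and half to control."""
    rng = random.Random(seed)  # fixed seed makes the allocation reproducible and auditable
    assignment = {}
    for area, clusters in clusters_by_area.items():
        shuffled = clusters[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for cluster in shuffled[:half]:
            assignment[cluster] = "treatment"
        for cluster in shuffled[half:]:
            assignment[cluster] = "control"
    return assignment

for cluster, arm in sorted(randomize_within_areas(clusters_by_area).items()):
    print(cluster, arm)
```

Stratifying the randomization by administrative area in this way guarantees that each area contains both arms, which can ease, though not eliminate, the negotiations with local gatekeepers described above.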

Selection Bias

Tensions between common practice in program implementation and the rigor of RCT designs also are notable. Program organizations often recruit participants in intentional (non-random) ways, with understandable attention to feasibility and programmatic goals. While defining program eligibility is essential, non-randomly selected eligible program participants may differ in observed and unobserved ways from other eligible individuals in the intervention community and in the control areas. These differences challenge our ability to attribute outcomes to the program.[2] Probability sampling reduces this risk but often is not practiced,[3] given the practical challenges. These threats are compounded if the control and intervention groups are selected differently.
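
As an illustration of what probability sampling from a census frame can look like in practice, the sketch below draws a simple random sample of eligible individuals so that every eligible person has the same, known chance of selection. The census fields, eligibility rule, and sample size are invented for the example.

```python
# A minimal sketch, assuming a pre-baseline census frame exists: draw a simple
# random sample of eligible individuals so that every eligible person has the
# same, known chance of selection. Census fields, the eligibility rule, and the
# sample size are invented for illustration.
import random

rng_census = random.Random(0)
census = [  # hypothetical census records for the study clusters
    {"person_id": i, "age": rng_census.randint(10, 60), "cluster": f"C-{i % 10:02d}"}
    for i in range(1, 2001)
]

def draw_probability_sample(frame, is_eligible, n, seed=2020):
    """Return n eligible records drawn by simple random sampling without replacement."""
    rng = random.Random(seed)
    eligible = [record for record in frame if is_eligible(record)]
    if n > len(eligible):
        raise ValueError("Requested sample exceeds the eligible population.")
    return rng.sample(eligible, n)

# Hypothetical eligibility rule: adolescents aged 12-16.
sample = draw_probability_sample(census, lambda r: 12 <= r["age"] <= 16, n=100)
print(f"Sampled {len(sample)} of {sum(12 <= r['age'] <= 16 for r in census)} eligible people.")
```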

Generalization of Trial Findings

Many programs and RCTs emphasize internal validity but end up answering very specific questions about select participants and locations. Evidence of program effectiveness in a particular setting often is sufficient for donors and governments to justify scale-up; however, characteristics of individuals and the setting may affect program effectiveness, limiting the extent to which findings from any one RCT can be generalized to other settings, even within the same country.[3,4]

“When does the research butt out?”

One of the challenges that organizations face when leading an RCT to evaluate their programming is that outside parties influence their usual operations in ways that were never anticipated. There is likely to be palpable concern over what feels like a loss of control over programming to researchers, even as responsibility and accountability to the funder remain with the organization. A programmatic colleague summed it up nicely one day when she asked, “So when does the research butt out?” The question arose when the normal pace and operation of the programming butted up against the needs of the RCT. The truth of the matter is that the rigor and monitoring required of an RCT keep researchers engaged in a sustained way, rather than simply at ‘baseline’ and ‘endline’ data collection. All programmatic decisions influence the design and rigor of the RCT. Decisions made over the course of program delivery have the potential to influence the outcome of the trial and its ability to answer its research question, while the conduct of the trial has an ongoing impact on program delivery. Further, the fact that many or all of the program participants also have agreed to participate in the research necessitates ongoing collaboration about what would otherwise be run-of-the-mill programmatic decisions. In reality, a high level of engagement between the research and program teams continues not only throughout program implementation but also through the interpretation and dissemination of findings, keeping researchers, who often are outsiders to the implementing organization, “butting in” for what might feel like forever.

Learning from the Tipping Point Impact Evaluation and other GBV-focused evaluations

1) Partnership between program and research teams, beginning at proposal development and well in advance of the design stage, is needed to provide early opportunities for mutual learning and trust building and to minimize the impact of the research on programming during the years of project implementation. Joint proposal writing helps to develop research and programmatic goals, cycles, and timelines that are shared and well integrated. This process also supports the development of a partnership founded on symmetrical relationships, complementary expertise, and shared learning: the ingredients that facilitate difficult decisions when tensions invariably arise between science and practice in the field.

2) Pre-trial data gathering should be built into timelines and budgets so formative work meets the needs of both the program and the research. Effectively communicating the importance of this data-collection phase to donors would support rigorous and feasible research designs and better alignment between the research and programmatic processes.

3) Embedding the program into pre-existing groups and local organizations often allows for comparably selected intervention and control groups and potentially offers a greater opportunity for sustainability compared to programming that is set up outside of existing structures.

4) Where random selection is not possible, quasi-experimental designs might be more appropriate.[2] If random selection of communities or other clusters is possible but random selection of participants is not, having pre-baseline census data in intervention and control communities enables researchers to compare program participants and non-participants and to consider post-hoc weighting of the sample (a minimal weighting sketch follows this list). One also could ask eligible participants in intervention and control areas whether they would be willing to participate in the programming, so that comparisons between intervention and control respondents can be made among those who would participate if given the chance. This process is easier and more ethical if the control communities will be receiving something; if the control group will not receive any programming, inquiring about participation may raise expectations that will not be fulfilled.

5) RCTs are one of many research designs, and they need not be privileged over others, given their shortcomings.[4,5] The RCT may not be the right design for a particular project at a particular time. Its contribution to evidence generation must be weighed against the programmatic pain and expense, with a critical assessment of the insight that a single trial can contribute to an organization’s strategic objectives. If an RCT is the desired approach, consideration of the different RCT stages, designs, and randomization techniques[2,6] in light of pre-trial data and program objectives will contribute to the strongest design and value for money.

6) Strong evidence is built on knowledge that accumulates over time from multiple sources, including program participants, field staff, programmatic experts, organizational knowledge, monitoring, process evaluations, and various types of research. In the end, the synthesis of these findings, with useful examples from our team,[7-10] offers guidance about the programmatic investments that organizations can make to achieve their missions. RCTs have a valued place among these ways of knowing, but well-executed RCTs offer answers only to specific questions, despite their common use to justify larger strategic decisions.[4] The debate about RCTs is an opportunity to reconsider their place among the multiple ways of knowing available to organizations seeking to have an enduring, positive impact on humanity, including the rights of women and girls.
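
The sketch below, referenced in point 4 above, illustrates one way to use pre-baseline census data to compare participants with eligible non-participants and to construct simple inverse-probability-of-participation weights (fit here with scikit-learn). The covariates, participation model, and data are hypothetical assumptions for illustration, not any analysis from the Tipping Point evaluation.

```python
# A minimal sketch, assuming census covariates and a participation indicator are
# available: compare participants with eligible non-participants and build simple
# inverse-probability-of-participation weights. All variables and data are
# hypothetical, not Tipping Point results.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Hypothetical census covariates for eligible individuals in intervention clusters.
age = rng.integers(12, 17, n)
household_assets = rng.normal(0.0, 1.0, n)

# Hypothetical non-random recruitment: participation rises with age and assets.
logit = -0.5 + 0.15 * (age - 14) + 0.4 * household_assets
participated = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

# 1) Compare participants with eligible non-participants on census covariates.
print("mean assets, participants:    ", round(household_assets[participated].mean(), 2))
print("mean assets, non-participants:", round(household_assets[~participated].mean(), 2))

# 2) Model participation and reweight participants toward all eligible individuals.
X = np.column_stack([age, household_assets])
p_participate = LogisticRegression().fit(X, participated).predict_proba(X)[:, 1]
weights = 1.0 / p_participate[participated]
print("weighted mean assets, participants:",
      round(np.average(household_assets[participated], weights=weights), 2))
```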

Suggested Readings

1. Avdeenko A, Frolich M. Research standards in empirical development economics: what's well begun, is half done. World Development. 2020;127:104786.
2. White H. An introduction to the use of randomised control trials to evaluate development interventions. Journal of Development Effectiveness. 2013;5(1):30-49.
3. Deaton A, Cartwright N. Understanding and misunderstanding randomized controlled trials. Soc Sci Med. 2018;210:2-21.
4. Muller SM. The implications of a fundamental contradiction in advocating randomized trials for policy. World Development. 2020;127:104831.
5. Frieden TR. Evidence for Health Decision Making - Beyond Randomized, Controlled Trials. N Engl J Med. 2017;377(5):465-475.
6. Smith PG, Morrow RH, Ross DA. Field Trials of Health Interventions: A Toolbox. 3rd ed. Oxford (UK): Oxford University Press; 2015.
7. Grose RG, Chen JS, Roof KA, Rachel S, Yount KM. Sexual and Reproductive Health Outcomes of Violence Against Women and Girls in Lower-Income Countries: A Review of Reviews. The Journal of Sex Research. 2020:1-20.
8. Grose RG, Roof KA, Semenza DC, Leroux X, Yount KM. Mental health, empowerment, and violence against young women in lower-income countries: A review of reviews. Aggression and Violent Behavior. 2019;46:25-36.
9. James-Hawkins L, Peters C, VanderEnde K, Bardin L, Yount KM. Women’s agency and its relationship to current contraceptive use in lower- and middle-income countries: A systematic review of the literature. Global Public Health. 2018;13(7):843-858.
10. Yount KM, Krause KH, Miedema SS. Preventing gender-based violence victimization in adolescent girls in lower-income countries: Systematic review of reviews. Social Science & Medicine. 2017;192(Supplement C):1-13.

Cari Jo Clark, Sc.D., M.P.H. is Associate Professor in the Hubert Department of Global Health at Rollins School of Public Health. Her research is focused on the health effects of exposure to child maltreatment and intimate partner violence, the measurement of violence and its associated norms, and the design and evaluation of primary and secondary prevention strategies. Dr. Clark is involved in evaluations of primary and secondary violence prevention interventions in a US-based health system, on a Jordanian college campus, and among adolescents, adults, and communities in Nepal.
University website: https://www.sph.emory.edu/faculty/profile/index.php?FID=8858
@cari_jo_clark

Kathryn M. Yount, PhD, is Asa Griggs Candler Chair of Global Health (2012) and Professor of Global Health and Sociology (2015) at Emory University. Her research centers on the social determinants of women’s health, including mixed-methods evaluations of social-norms and empowerment-based programs to reduce gender-based violence (GBV) and health disparities in underserved populations. She has been funded since 2002 by U.S. federal agencies, foundations, and foreign agencies to work in Asia, Latin America, the Middle East, Sub-Saharan Africa, and Atlanta, contributing more than 180 publications. She is founding director of GROW, which advances scholarship, leadership, and global change in women’s and girls’ empowerment, GBV prevention, and women’s health. In 2016, Dr. Yount received the university-wide Women of Excellence award for mentoring. She is a member of the Population Association of America’s Board of Directors (2018-2020), an advisor on DfID- and Gates-funded consortia, and a member of the DfID-funded Gender and Adolescence: Global Evidence (GAGE) consortium, in which she leads the Room-to-Read Girls’ Education Program evaluation in Nepal. Dr. Yount has served as President/Chair of the Emory University Senate and Faculty Council (2015), a member of Provost committees to revise university policies on authorship, scholarship, mentorship, research ethics, and data sharing, a member of all scholarship committees for the Laney Graduate School (LGS), and Social Sciences Chair of the University Research Committee, which awards seed grants to faculty. In 2019, the graduate faculty elected Dr. Yount to the LGS Executive Council, which advises the Dean on policy, curricular revisions, and new programs (2019-2022). Dr. Yount lives in Atlanta with her daughters and husband. Connect with her and her team on LinkedIn, Twitter, and Facebook.

Sudhindra Sharma, PhD, is the Executive Director at Inter Disciplinary Analysts (IDA), a research and consulting firm based in Kathmandu, Nepal. A sociologist by training, he completed his PhD at the University of Tampere, Finland, in 2001 and his Master’s at Ateneo de Manila University, Philippines, in 1992. He was involved as an international consultant with The Asia Foundation in Afghanistan, where he provided technical inputs to the Survey of the Afghan People from 2007 to 2011. He has led over 30 nationwide surveys in Nepal, including Nepal Contemporary Political Situation (2004-2015), Public Safety and Security (2007-2011), the Business Climate Survey (2010-11), and the baseline survey, four perception surveys, and end-line survey, based on a quasi-experimental research design, for the USAID-funded local governance and community development program Sajhedari Bikaas (2013-2017), as well as many other surveys for various bilateral donors, the Government of Nepal, and international NGOs. With the commencement of the Survey of the Nepali People in 2017, supported by DFAT, Australia, and in concert with The Asia Foundation and Kathmandu University, Dr. Sharma has led the team on behalf of IDA. Dr. Sharma has led the field implementation of research projects with RCT designs, namely (1) Evaluation of the Welfare Impacts of Livestock Transfer Program in Nepal, supported by USAID and implemented by Heifer International in partnership with Georgia University, State University of Montana, and IFPRI, and (2) Designing and Evaluating Innovations for Development of Small Holder Female Livestock Cooperatives in Nepal, supported by USAID and implemented by Heifer International in collaboration with University of Florida, University of Georgia, and Kansas State University. He has published books in Nepal and has contributed to and co-authored several policy papers brought out by the Institute of Development Studies, University of Helsinki. He has international experience as a visiting scholar at the German Development Institute, Bonn; the Institute of Development Studies, Helsinki; the Center for the Study of Developing Societies, Delhi; and the Institute of Asian Studies at Chulalongkorn University, Bangkok. He was awarded a Docentship in Development Studies at the University of Helsinki in February 2009 and has since also begun supervising PhD candidates enrolled in Development Studies at the University of Helsinki.
