- Open Access
Assessing health program performance in low- and middle-income countries: building a feasible, credible, and comprehensive framework
Globalization and Health, volume 11, Article number: 51 (2015)
Many health service delivery models are adapting health services to meet rising demand and evolving health burdens in low- and middle-income countries. While innovative private sector models provide potential benefits to health care delivery, the evidence base on the characteristics and impact of such approaches is limited. We have developed a performance measurement framework that provides credible (relevant aspects of performance), feasible (available data), and comparable (across different organizations) metrics that can be obtained for private health services organizations that operate in resource-constrained settings.
We synthesized existing frameworks to define credible measures. We then examined a purposive sample of 80 health organizations from the Center for Health Market Innovations (CHMI) database (healthmarketinnovations.org) to identify what the organizations reported about their programs (to determine feasibility of measurement) and what elements could be compared across the sample.
The resulting measurement framework includes fourteen performance dimensions within three categories: health status, health access, and operations/delivery.
The emphasis on credible, feasible, and comparable measures in the framework can assist funders, program managers, and researchers to support, manage, and evaluate the most promising strategies to improve access to effective health services. Although some of the criteria that the literature views as important – particularly population coverage, pro-poor targeting, and health outcomes – are less frequently reported, the overall comparison provides useful insights.
Adapting health services to meet the rising demand and evolving health burden in low- and middle-income countries (LMICs) is key to improving health outcomes. Interest in the potential for health innovations to improve the quality of and access to health care for LMIC populations is growing rapidly [1–3]. Many organizations, including private providers, governments, donors, and social impact investors, have developed and supported innovative approaches to health services delivery for the poor. In particular, the private health sector, which includes for-profit and not-for-profit, formal and non-formal entities, plays a significant innovative role in influencing health policy and providing health care and supplies in LMICs [5, 6]. However, our evidence on what works, particularly in the private health sector of developing regions, is relatively weak, and greater understanding of the effectiveness, scale, and scope of private sector initiatives is needed. Innovative programs are seldom evaluated in a way that allows for meaningful comparisons [9–11], and in rapidly changing health markets, formal evaluations are often too time-consuming and costly for new interventions or rapidly evolving organizations. New approaches are needed to improve the knowledge base on health markets in LMICs, which is crucial for improving health policy and practice. This requires a cohesive set of measures that balance credibility (relevant aspects of performance), comparability (across different organizations), and feasibility (available data).
Performance measurement frameworks seek to determine the activities and success of a program’s strategy and provide insights for future improvements. Multiple performance frameworks have been designed to assess health systems [14, 15], health service delivery organizations [13, 16], and health quality [17, 18]. Additional frameworks measure the impact of socially responsible businesses and social enterprises [19, 20].
While some of the performance measures in existing frameworks have been rigorously tested to determine their credibility, they face substantial challenges in comparability and feasibility. Measures face comparability challenges because they are often specific to certain practices and health areas, making them difficult to apply across health areas and models. They face feasibility challenges because they often do not consider whether programs have the capacity to collect and report the necessary data, imposing burdens that may detract from service delivery, particularly for smaller and newer health programs.
Performance measures are relevant for multiple stakeholders. Funders and researchers must compare health programs to determine what activities they are undertaking and which are performing well. Program managers are interested in a minimum data set that is relevant to operations and that allows them to assess their performance relative to their peers. Meeting the goals of these stakeholders requires that performance measures are credible in assessing relevant aspects of performance, comparable in evaluating programs across different health areas and models, and feasible for programs to report. The measures need to achieve a balance among these three elements of assessment.
This paper presents a balanced framework for assessing the performance of health care programs in LMICs and elsewhere. This framework integrates important existing approaches and supplements them with novel operational criteria. The result is a template that organizations can use for reporting purposes and may also serve as a practical tool for policy makers, funders, and researchers to assess programs for investment and scaling to maximize their health impact.
The Toronto Health Organization Performance Evaluation (T-HOPE) framework was developed using an iterative, qualitative process. Our aim was to develop a set of performance dimensions that balance what is theoretically desirable and what is empirically viable, with an emphasis on identifying measures that are credible, feasible, and comparable across health programs.
Literature - credibility and comparability
We began by consolidating eleven existing performance frameworks on health service evaluation [14–18], social impact investment [19–21], and business process innovation [6, 22, 23]. This yielded an initial composite framework of twelve performance dimensions. In this process, we identified credible dimensions vetted by scholars and practitioners that were relevant for comparing a variety of health and business models. In consolidating frameworks from different disciplines, we focused on selecting robust dimensions applicable to a broad range of programs.
Practice - feasibility
We next considered the performance measures that health programs are already reporting. We reviewed performance data reported by a purposive sample of 80 diverse, data-rich programs from the Center for Health Market Innovations (CHMI) database (healthmarketinnovations.org). This database catalogs over 1400 innovative health programs in LMICs, with an emphasis on private sector delivery (including for-profit, not-for-profit, and public-private partnership (PPP) initiatives that serve poor populations in LMICs), and displays reporting provided by the programs. We selected the 80 programs for the sample by focusing on programs with available data in four important areas of health activity: the established fields of maternal, newborn and child health (MNCH), general primary care, and infectious diseases, plus the emerging area of mHealth. We supplemented the data available on these 80 programs in the CHMI database by collecting data from publicly available sources through an online search of program websites and reports, journal articles, and news websites.
In our review of these 80 programs, our aim was to determine the types of measures programs are already reporting, to assess feasibility, while maintaining comparability by identifying common measures reported by a range of programs from different health areas with different models. The assessment included programs operating in diverse health areas, such as MNCH, eye care, tuberculosis, primary care, family planning and reproductive health. The programs commonly employ innovative operational models, such as social franchising, public private partnerships, clinic chains and networks, mobile clinics, social marketing, microinsurance, and use of mobile health technologies. Through this review, we identified performance dimensions in our initial composite framework that a variety of innovative health programs also are reporting data on, updating our framework to reflect this aspect of feasibility.
We then refined our initial framework by reviewing the relevant literature on each of the performance dimensions, including academic publications and technical reports. This review sought to strengthen the definitions and measurement approaches in a way that provides a relevant balance of our three desired characteristics:
Credibility: Consistent with ideas commonly presented in the literature
Feasibility: Based on existing reporting, requiring limited time and effort to provide data
Comparability: Programs engaging in different health areas and models could report on the dimension
Results and discussion
Through this process, we developed the T-HOPE framework, which includes three categories of performance – health status, health access, and operations/delivery. Within the three categories, there are fourteen performance dimensions: three with definitions for health status, three for health access, and eight for operations/delivery. Table 1 summarizes the framework, providing definitions, indicators, and examples for each dimension. We also drew from the literature to identify seven descriptive fields, which Table 2 summarizes. The descriptive fields are useful for building profiles and understanding the context of specific programs.
Table 3 reports the frequency of reporting for each performance dimension by the 80 CHMI programs in our sample (i.e., the proportion of the 80 programs that report data for each framework dimension). The table also disaggregates the frequency of reporting based on subgroups for health area, type of innovation, and legal status. While there is substantial variation across subgroups, for each of the 14 performance dimensions a large majority of subgroup values fall within a 50% range around the mean reporting frequency.
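As an illustration of how such reporting frequencies could be tabulated, the sketch below assumes a minimal record for each program listing which T-HOPE dimensions it reports; the program names, dimension labels, and sample data are hypothetical, not drawn from the CHMI database.

```python
# Sketch: reporting frequency per performance dimension, overall and
# disaggregated by subgroup (e.g., health area). All data are illustrative.
from collections import defaultdict

programs = [
    {"name": "Program A", "health_area": "MNCH",
     "reported": {"Health Outcome", "Affordability", "Financial Model"}},
    {"name": "Program B", "health_area": "mHealth",
     "reported": {"Affordability", "Management Quality"}},
    {"name": "Program C", "health_area": "MNCH",
     "reported": {"Health Outcome", "Management Quality"}},
]

def reporting_frequency(programs, dimension):
    """Proportion of programs that report data for a given dimension."""
    n = sum(1 for p in programs if dimension in p["reported"])
    return n / len(programs)

def frequency_by_subgroup(programs, dimension, key="health_area"):
    """Reporting frequency for a dimension, split by a subgroup field."""
    groups = defaultdict(list)
    for p in programs:
        groups[p[key]].append(p)
    return {g: reporting_frequency(ps, dimension) for g, ps in groups.items()}

print(reporting_frequency(programs, "Health Outcome"))    # overall frequency
print(frequency_by_subgroup(programs, "Health Outcome"))  # by health area
```

With real data, the same two functions would reproduce both the overall proportions and the subgroup breakdowns that Table 3 presents.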
This framework can be used to understand a program’s performance, including its activities, goals, and organizational context. The dimensions are framed and defined in a manner that balances comprehensiveness with comparability across diverse programs. By systematically applying the criteria in the framework, diverse stakeholders, including program managers, funders, and researchers, can reach an understanding of relative program performance.
To illustrate the framework, Tables 4, 5 and 6 compare ten programs: two providing eye care services, five in mHealth, and three in MNCH. Together, the ten cases provide comparisons for all fourteen dimensions in the T-HOPE framework. We summarize the comparisons here, in terms of their implications for funders, researchers, and program managers.
Eye care service comparisons
Table 4 compares the performance dimensions for two facilities that provide cataract surgery: Program Eye Care 1, a for-profit program in Latin America, and Program Eye Care 2, a not-for-profit program in South Asia. Several implications arise for different types of stakeholders.
Funders: Funders can use the comparison to help identify high-opportunity investments, based on the strength of the factors that a given funder believes are most relevant for its goals. In this example, a funder focused primarily on serving disadvantaged populations may choose to fund Program Eye Care 1, given that a greater proportion of its patients are poor, or might instead fund Program Eye Care 2 to help it serve a larger number of poor people, even if the proportion is smaller.
Researchers: Scholars can use the comparison to research innovation and performance, such as exploring how different aspects shape program performance, including the operating context (Latin America vs. South Asia, rural vs. urban), legal status (for-profit vs. not-for-profit), and model infrastructure (hub and spoke vs. hospital).
Program managers: Program managers, meanwhile, can use the comparison to identify opportunities to learn new skills and techniques. For instance, Program Eye Care 1 might seek to understand how Program Eye Care 2 grew its population coverage and learn from Program Eye Care 2’s efficiency in performing cataract surgeries.
mHealth comparisons
Table 5 compares the performance dimensions of five programs using mHealth: Program mHealth 1, a for-profit hospital using management software in South Asia; Program mHealth 2, a not-for-profit telemedicine program in South Asia; Program mHealth 3, a not-for-profit mobile monitoring program in sub-Saharan Africa; Program mHealth 4, a not-for-profit medical center and call center in South America; and Program mHealth 5, a PPP operating clinics with telemedicine services in South Asia.
Funders: Funders such as investors may be particularly interested in partnering with Program mHealth 1, which has shown strong revenue and profits through its financial model, as well as strong performance in non-economic efficiency and management quality as evidenced by its ISO 9001:2008 certification. Donors may want to support the efforts of Programs mHealth 2 and 4, which have achieved substantial scale in providing affordable and efficient health services. Donors interested in helping a medically successful program that needs financial support may be drawn to Program mHealth 3. Public agencies and policy makers looking for PPP models may want to explore Program mHealth 5’s successful approach to partnership.
Researchers: Researchers may be interested in exploring how Programs mHealth 1 and 2 are able to serve many more patients per day than other local options and the types of procedures that are amenable to this. They may want to study how these programs, both for-profit and not-for-profit, have been able to develop relationships with government entities to deliver their programs, and the advantages and challenges of doing so. Researchers may also want to study how Program mHealth 5 has contributed to improvements in local health outcomes.
Program managers: Program managers may be interested in learning how Programs mHealth 2 and 3 are able to achieve high satisfaction ratings with patients, and how to scale up services to serve the large numbers of patients Programs mHealth 1 and 2 are able to serve. Program managers may also be interested in learning about the value proposition that Program mHealth 5 has used to gain substantial financial support from public bodies.
Maternal, newborn, and child health (MNCH) comparisons
Table 6 compares the performance of three MNCH programs: Program MNCH 1, a for-profit hospital chain serving women and children in South Asia; Program MNCH 2, a not-for-profit clinic franchise focusing on MNCH and reproductive health in Southeast Asia; and Program MNCH 3, a not-for-profit clinic franchise offering MNCH and general primary care services in South Asia.
Funders: Funders may be particularly interested in the couple-years of protection (CYPs) generated by programs and the ability of Program MNCH 3 to provide CYPs at a relatively low cost, choosing to support programs that are able to produce health outcomes most cost-effectively.
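The cost-effectiveness comparison that funders might make can be sketched as a simple cost-per-CYP calculation. The figures below are invented for illustration and are not drawn from the programs in Table 6.

```python
# Sketch: comparing cost per couple-year of protection (CYP) across
# programs. All figures are hypothetical.

programs = {
    "Program MNCH 2": {"annual_cost_usd": 500_000, "cyps_generated": 40_000},
    "Program MNCH 3": {"annual_cost_usd": 300_000, "cyps_generated": 50_000},
}

def cost_per_cyp(p):
    """Program cost divided by CYPs generated over the same period."""
    return p["annual_cost_usd"] / p["cyps_generated"]

# Rank programs from most to least cost-effective.
for name, p in sorted(programs.items(), key=lambda kv: cost_per_cyp(kv[1])):
    print(f"{name}: ${cost_per_cyp(p):.2f} per CYP")
```

A funder applying this logic to reported data would favor the program with the lowest cost per CYP, all else being equal, which is the kind of trade-off the framework is intended to make visible.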
Researchers: Researchers may be interested in understanding how MNCH 1 has influenced the health behaviours of pregnant women. Given that Programs MNCH 2 and 3 are franchises, scholars may also want to explore how Health Outcome, Clinical Quality, User Satisfaction, and Management Quality compare with non-franchised MNCH programs.
Program managers: Program managers may find the data on Human Resources Supply particularly relevant, including MNCH 1’s efforts to employ non-physician health workers to keep costs low, the types of training provided by Program MNCH 2 for its franchisees, and reasons for staff turnover in Program MNCH 3’s franchise model.
In these examples, the framework data give a snapshot of performance information about each program, and provide an entry point for funders, researchers, and program managers to conduct preliminary comparisons and identify avenues for further investigation. Applied at regular intervals, these performance dimensions can also help track program performance over time, providing a richer understanding of a program’s capabilities and potential. Moreover, understanding program performance requires knowledge of the program operations, goals, challenges, and processes that shape it; the descriptive fields offer relevant information that complements the T-HOPE performance framework.
One of the key strengths of this framework is the integration of established approaches for measuring the performance of health programs and organizations. The wide variety of tools used today creates confusion, puts an inappropriate burden on delivery organizations, and fails to achieve comparability. Delivery organizations in LMICs with limited resources often have difficulty meeting the monitoring and evaluation demands placed on them by different donors, suggesting the need for greater coordination on reporting requirements and simplified measures [24, 25]. By harmonizing measurement requirements, funders may implement more effective pan-organizational strategies for achieving targeted health outcomes while reducing the reporting burden on the organizations they fund.
This framework can be used to highlight and compare the performance of innovative health programs for various stakeholders. However, while providing a snapshot of program performance at a moment in time, it will be of greatest value when combined with descriptive information about program activities, goals, and context that shapes this performance. It can provide an even richer understanding of program performance if applied over time to track progress. Also, while the framework can facilitate comparison of performance amongst programs and over time, given the diversity of innovative models emerging, we have not included benchmarks for the example indicators of our performance dimensions. Benchmarks will vary by health area and operational model, and program managers and others can identify whether their programs are meeting accepted standards.
While we have endeavored to develop credible, feasible, and comparable performance measures, some of the framework criteria are structurally more difficult to measure than others, as Table 3 highlights. For example, Population Coverage requires an accurate, quantified measure of a program’s target population, which may not be readily available in resource-limited settings without birth registration and accurate census information. Measuring Pro-Poor Targeting may involve complex and multidimensional considerations for identifying poor patients. Assessing Health Outcome, meanwhile, may be challenging and time-consuming, involving tracking patient health status after the intervention; this may require impact evaluations, with advance planning, additional funding, and rigorous research designs to ensure the results are attributable to the program, a research approach that relatively few social development programs have been able to carry out.
We have included these performance dimensions in the framework because they are considered critical for assessing impact in the literature [29–31]. We have aimed to provide simple and straightforward definitions and example indicators based on the reporting of programs in the CHMI database. However, some dimensions may require additional information and knowledge that is not as easily accessible for new and small-scale programs as it is for large-scale, established ones. Greater technical and financial support is needed from stakeholders such as funders and researchers to assist program managers with reporting on this valuable data [7, 28]. In addition, further field-testing of the framework can help to refine these performance dimensions so they are more attainable for program managers and also help to identify more feasible methods for program managers to access this information in resource-constrained contexts.
Despite these limitations, the development of an integrative framework that acknowledges and balances the tradeoffs between credibility, feasibility, and comparability is urgently needed. This could benefit programs interested in understanding and communicating their activities and accomplishments; funders making decisions on which programs to support; and researchers seeking to better understand performance of innovative health care delivery models and programs. This framework also aims to encourage greater discussion on the types of metrics needed to meaningfully and cost-effectively understand program performance, identifying areas for improvement and opportunities for further collaboration and discourse amongst different groups with shared interests in global health.
The T-HOPE framework is designed to cultivate the adoption of performance measures that meet the needs of diverse programs, while encouraging collaboration, coordination, and sharing of knowledge among programs, funders, and researchers. In doing so, the framework provides an important step towards accurately and realistically assessing the health impact and sustainability of programs aiming to meet the needs of the poor.
In practice, this framework has been incorporated into CHMI’s Reported Results initiative. Through this initiative, programs can display public profiles with reporting on selected performance dimensions. The T-HOPE approach has also informed the Impact Reporting and Investment Standards (IRIS) health working group of the Global Impact Investing Network in the development of a core set of health metrics for social enterprises. The resulting IRIS metrics, while focused on a small number of process measures that are pertinent to clinics and hospitals, have been selected to enhance comparability. In parallel, the more comprehensive T-HOPE framework allows for comparisons across a wider range of program types, and may be used to describe tradeoffs between quality, cost, and accessibility. Thus, the approaches are complementary: IRIS metrics may be used to scan for promising activities among hospitals and clinics, while the T-HOPE framework can be used to structure in-depth analyses and comparison of health programs.
The collection of credible, feasible, and comparable information on health organization performance is essential for identifying effective and innovative approaches to delivery. By understanding and comparing the performance of health programs, we can better determine which models are generating innovations that create health impact and real value in LMICs. Such understanding is crucial to progress.
Abbreviations
BCVA: Best corrected visual acuity
CHMI: Center for Health Market Innovations
CYP: Couple-years of protection
IRIS: Impact Reporting and Investment Standards
LMIC: Low- and middle-income country
MNCH: Maternal, newborn, and child health
T-HOPE: Toronto Health Organization Performance Evaluation
Bloom G, Henson S, Peters DH. Innovation in regulation of rapidly changing health markets. Global Health. 2014;10:53.
Dandonoli P. Open innovation as a new paradigm for global collaborations in health. Global Health. 2013;9:41.
Binagwaho A, Nutt CT, Mutabazi V, Karema C, Nsanzimana S, Gasana M, et al. Shared learning in an interconnected world: innovations to advance global health equity. Global Health. 2013;9:37.
Hanson K, Berman P. Private health care provision in developing countries: a preliminary analysis of levels and composition. Health Policy Plan. 1998;13:195–211.
Swanson RC, Atun R, Best A, Betigeri A, de Campos F, Chunharas S, et al. Strengthening health systems in low-income countries by enhancing organizational capacities and improving institutions. Global Health. 2015;11:5.
Bhattacharyya O, Khor S, McGahan A, Dunne D, Daar AS, Singer PA. Innovative health service delivery models in low and middle income countries - what can we learn from the private sector? Health Res Policy Syst. 2010;8:24.
Bennett S, Lagomarsino G, Knezovich J, Lucas H. Accelerating learning for pro-poor health markets. Global Health. 2014;10:54.
Hanson K, Gilson L, Goodman C, Mills A, Smith R, Feachem R, et al. Is private health care the answer to the health problems of the world’s poor? PLoS Med. 2008;5:e233.
Schweitzer J, Synowiec C. The economics of eHealth and mHealth. J Health Commun. 2012;17:73–81.
Mills A, Brugha R, Hanson K, McPake B. What can be done about the private health sector in low-income countries? Bull World Health Organ. 2002;80:325–30.
Howitt P, Darzi G, Yang G, Ashrafian H, Atun R. Technologies for global health. Lancet. 2012;380:507–35.
Bennett S, Bloom G, Knezovich J, Peters DH. The future of health markets. Global Health. 2014;10:51.
Kalinichenko O, Amado CAF, Santos SP. Performance assessment in primary health care: a systematic literature review. Faro: CEFAGE-UE; 2013.
World Health Organization (WHO). Monitoring the building blocks of health systems: a handbook of indicators and their measurement strategies. Geneva: WHO; 2010.
De Savigny D, Campbell AT, Best A. Systems thinking: what it is and what it means for health systems. In: De Savigny D, Adam T, editors. Systems Thinking for Health Systems Strengthening. Geneva: WHO; 2010. p. 37–48.
Bradley EH, Pallas S, Bashyal C, Berman P, Curry L. Developing strategies for improving health care: guide to concepts, determinants, measurement, and intervention design. Health, Nutrition and Population (HNP) Discussion Paper. Washington: World Bank; 2010.
Kelly E, Hurst J. Health care quality indicators project: conceptual framework paper. OECD Health Working Papers No. 23. Paris: OECD Publishing; 2006.
Donabedian A. Evaluating the quality of medical care. Milbank Q. 2005;83:691–729.
Clark C, Rosensweig W, Long D, Olsen S. Double bottom line project report: assessing social impact in double bottom line ventures. Methods catalog. Berkeley: Center for Responsible Business; 2004.
Global Impact Investing Network (GIIN). IRIS metrics. 2012. https://iris.thegiin.org/metrics. Accessed 20 October 2012.
Kaplan RS, Norton DP. The balanced scorecard: measures that drive performance. Harv Bus Rev. 1992;70:71–9.
CHMI. Performance measurement. 2015. http://healthmarketinnovations.org/chmi-themes/performance-measurement. Accessed 12 January 2015.
Ojha NP, Ghosh P, Khandelwal S, Kapoor H. Innovation overview. Bus Today. 2011;53–56.
Ebrahim A. NGOs and organizational change: discourse, reporting, and learning. Cambridge: Cambridge University Press; 2005.
Bornstein L. Systems of accountability, webs of deceit? Monitoring and evaluation in South African NGOs. Development. 2006;49:52–61.
Yang A, Farmer PE, McGahan AM. “Sustainability” in global health. Glob Public Health. 2010;5:129–35.
Hulme D, Moore K, Shepherd A. Chronic poverty: meanings and analytical frameworks: CPRC Working Paper 2. Manchester: Chronic Poverty Research Centre; 2001.
Savedoff W, Levine R, Birdsall N. When will we ever learn? Improving lives through impact evaluation. Report of the evaluation gap working group. Washington, D.C: Center for Global Development; 2006.
Jee M, Or Z. Health outcomes in OECD countries. Paris: OECD Publishing; 1999.
Shengelia B, Murray C, Adams O. Beyond access and utilization: defining and measuring health system coverage. In: Murray C, Evans D, editors. Health systems performance assessment: debates, methods, empiricism. Geneva: World Health Organization; 2003. p. 221–34.
Patouillard E, Goodman CA, Hanson KG, Mills AJ. Can working with the private for-profit sector improve utilization of quality health services by the poor? A systematic review of the literature. Int J Equity Health. 2007;6:17.
Global Impact Investing Network (GIIN). Healthcare Delivery. 2015. https://iris.thegiin.org/health-metrics. Accessed 12 January 2015.
Sixty-Second World Health Assembly. Prevention of avoidable blindness and visual impairment: report by the Secretariat. Geneva: WHO; 2009.
This article is based on research conducted by the Toronto Health Organization Performance Evaluation (T-HOPE) team at the University of Toronto under contract with Results for Development Institute (r4d.org). Anita McGahan, Will Mitchell, Kathryn Mossman, John Ginther and Raman Sohal are also supported by Canadian Social Sciences and Humanities Research Council (SSHRC) Grant #435120102. The authors of this article are responsible for its contents. No statement in this article should be construed as an official position of Results for Development Institute or SSHRC. The funders had no role in the study design, data collection, analysis, or decision to publish.
The authors thank Daniela Graziano for her assistance in reviewing the manuscript.
The authors declare that they have no competing interests.
OB, KM, JG, LH, RS, JC, AB, JM, HP, IS, WM, and AM conceived of and participated in the design of the study and reviewed existing performance frameworks to develop a composite framework. KM, JG, LH, RS, JC, AB, JM, HP, and IS collected program performance data for the programs under review. KM, JG, RS, JC, and AB analyzed the program data to refine the framework. OB, KM, JG, LH, WM, and AM drafted the manuscript. OB, KM, JG, LH, RS, JC, AB, JM, HP, IS, WM, and AM critically reviewed the manuscript. All authors have read and approved the final manuscript.