Design and setting
We conducted a two-arm quasi-experimental pre-post evaluation study comparing two 9-month intervention packages to establish their relative impact on parenting skills and environment among a cohort of caregivers of children under age 3 years in Tabora region. Tabora is located in central-western Tanzania and is home to a predominantly rural (87%) population and an agricultural economy [34]. In Tanzania, about 8 in 10 adults are literate (77% of women and 83% of men), with about half having completed primary education, and 23% of women and 28% of men having secondary or higher education [30]. The study included four districts purposefully divided into two intervention groups. The first group (Kaliua and Uyui districts) was exposed to the minimal intervention package, composed of radio messaging (RM) only. The second group (Nzega and Igunga districts) was exposed to the Malezi II full intervention package, composed of radio messaging, the introduction of short video job aids primarily for CHW use, and the CCD program (RMV-ECD), first implemented under the Malezi I phase and continued under Malezi II.
Intervention
Radio messaging (RM)
ECD radio messages (37 different spots) were aligned with government policy and community-tested before being aired on the three most popular radio stations in Tabora at least 10 times per day from March to December 2020. These spots focused on the importance of playing with, talking with, and praising young children, using positive discipline, and the importance of both mothers and fathers interacting with young children. Radio messaging was targeted to reach all four study districts as much as possible, though coverage of some stations varied by district and proximity to urban centers.
CCD program (RMV-ECD)
In addition to radio messaging, caregivers in this arm were also allocated to a CHW for monthly household visits. These CHWs received training on CCD and on how to use five short (5–6 minutes) ECD video job aids loaded onto electronic tablets in individual home sessions. The short videos were also used during group counseling sessions at clinics but this was not limited to the RMV-ECD arm caregivers. Videos were produced in Tabora and showed local caregivers and community health workers demonstrating recommended ECD practices (Swahili videos with English subtitles can be viewed at https://www.developmentmedia.net/project/malezi-ii/). Four of the videos concentrated on nurturing care practices specific to age groups (0–6 months, 6–12 months, 12–24 months, 24–36 months) and one video covered cross-cutting issues, applicable to all ages. Intervention fidelity monitoring data were collected monthly to document completion of CHW home visits to assigned caregivers.
All ECD radio messaging and video (job aid) content was developed through an iterative process informed by target community members, including mothers, fathers, and elders, who are influential stakeholders in perpetuating community and family norms around child-rearing. Content development was led by Development Media International (DMI), which deployed behavioral media specialists to conduct focus group discussions (FGD). These discussions aimed to understand how the content was being received by the target audience of caregivers and to investigate whether individuals who had heard the ECD content could identify the norms and behaviors it aimed to address.
Sample and sampling procedures
As the comprehensive program intervention was primarily delivered by CHWs who were affiliated with health facilities, we purposefully selected 31 health facilities located in 29 administrative units called wards (6–8/district). From these wards, 75 National Bureau of Statistics census enumeration areas (EA) were randomly sampled proportional to population size. To minimize selection bias in participant recruitment, the study team aimed to enumerate all households in sampled EAs. A household was listed as potentially eligible if it had a resident adult (> 18 years) who was a primary caregiver of a child aged 0–24 months, intended to remain in the same area for at least 1 year, and was willing to be home-visited by a CHW. From these listed households, only one primary caregiver per household was recruited by a study enumerator. Caregivers were excluded from the study if they were not able to provide written informed consent due to a cognitive impairment or language barrier, were the primary caregiver of an index child with a congenital anomaly or other disability, or worked as a CHW or medical provider.
We estimated that a sample size of 430 caregivers per intervention group would provide 90% power to detect a 15% difference between the RM and RMV-ECD intervention groups at endline, and > 80% power to detect at least a 5% change in each intervention group between baseline and endline, at a 5% significance level. Of 8880 households enumerated, 1248 caregivers were recruited into the study and interviewed at baseline (October–December 2019). Among these, 12 were withdrawn (10 refused after enrolment; two were excluded after being found ineligible for the study); one caregiver died; and 184 caregivers moved out of the area. Of the remaining (n = 1051) eligible for follow-up, 47 (4%) were not available for interview after several attempts, and 1004 (96%) were successfully interviewed at endline (January–March 2021; Fig. 1), which was 17% higher than our minimum required sample size. Almost all (n = 985; 98%) caregivers interviewed at endline remained the primary caregiver of the index child from baseline. Of the 19 caregivers whose index child had died or moved from the household, eight nominated an eligible “replacement” child under 3 years and 11 completed a partial interview skipping questions that were no longer applicable.
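The participant flow arithmetic in this paragraph can be checked with a short sketch. All counts are taken from the text; the variable names are illustrative only, and the total required sample is assumed to be twice the per-arm estimate.

```python
# Counts reported in the text; variable names are illustrative.
recruited = 1248
withdrawn = 12            # 10 refused after enrolment + 2 found ineligible
died = 1
moved_away = 184
interviewed_endline = 1004
required_per_arm = 430    # from the sample size estimate

# Caregivers still eligible for follow-up at endline.
eligible_followup = recruited - withdrawn - died - moved_away
not_reached = eligible_followup - interviewed_endline

# Assumption: minimum required total is two arms of equal size.
required_total = 2 * required_per_arm

followup_rate = interviewed_endline / eligible_followup
excess_over_required = interviewed_endline / required_total - 1

print(eligible_followup)              # 1051
print(not_reached)                    # 47
print(f"{followup_rate:.1%}")         # 95.5%
print(f"{excess_over_required:.1%}")  # 16.7%
```

After rounding, these match the reported 96% follow-up and the 17% margin over the minimum required sample.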

Data collection and study variables
Baseline and endline structured questionnaires were administered via interview in a private place in or near the consenting caregiver’s home in the national language (Swahili). Interviews assessed exposure to the intervention and caregiver knowledge/practices using questions tailored to the study intervention. Responses were based on caregiver self-report and interviewer observations and categorized according to five outcome variables reflecting caregiver knowledge, stimulation practices, father engagement, responsive care, and household environment safety. Continuous scores for each variable were dichotomized at the median for analysis. Some scores, where the number of items differed by age of the child (early stimulation) or sub-group (environment risk), were standardized to a 0–1 scale by dividing the raw score by the number of items.
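As a minimal sketch of the median dichotomization described above: the function name is hypothetical, and the handling of scores exactly equal to the median is an assumption, since the text does not state which side of the split they fall on.

```python
from statistics import median

def dichotomize_at_median(scores):
    """Code each score as 1 (above the sample median) or 0 (otherwise).

    Assumption: scores equal to the median are coded 0; the study text
    does not specify how ties were handled.
    """
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]

# Illustrative standardized scores, not study data.
example_scores = [0.2, 0.5, 0.5, 0.8, 1.0]
print(dichotomize_at_median(example_scores))  # [0, 0, 0, 1, 1]
```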
Age-appropriate ECD knowledge was assessed from six questions (scoring 0–6 points) asking the caregiver to describe one specific way that a “caregiver can support a child’s mental, emotional or physical development …” during pregnancy, from birth to 6 months, from 6 to 9, 9–12, and 12–24 months and 2–5 years of age. Caregiver responses were recorded by the interviewer verbatim and coded by the co-Principal Investigator (JF) as correct or incorrect. Non-specific responses such as “seeking health care” were not considered correct.
Early stimulation practice and father engagement measures were adapted from questions originating from UNICEF’s Multiple Indicator Cluster Surveys (MICS), a widely validated survey used in over 100 countries over the past two decades [35,36,37]. Caregivers of children under 7 months could score up to three points for reporting that the mother, father, or other adult engaged the child in singing songs, taking the child outside, or playing with the child in the past week. Caregivers of children over 7 months were asked three additional items (read books, told stories, named/counted things with the child) for a total of six points. The sum of the items for each of these two measures was then standardized to a 0–1 scale.
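The age-dependent scoring can be sketched as follows. The item keys and function name are hypothetical labels for the activities listed above, and assigning a child aged exactly 7 months to the older group is an assumption, because the text only distinguishes "under" and "over" 7 months.

```python
# Hypothetical item keys for the activities named in the text.
YOUNG_ITEMS = ("sang_songs", "took_outside", "played")
OLDER_ITEMS = YOUNG_ITEMS + ("read_books", "told_stories", "named_or_counted")

def stimulation_score(child_age_months, responses):
    """Return the stimulation score standardized to a 0-1 scale.

    responses maps item keys to True/False; missing keys count as False.
    Assumption: a child aged exactly 7 months gets the six-item version.
    """
    items = YOUNG_ITEMS if child_age_months < 7 else OLDER_ITEMS
    raw = sum(1 for item in items if responses.get(item, False))
    return raw / len(items)

# A 4-month-old with two of three activities reported.
print(stimulation_score(4, {"sang_songs": True, "played": True}))
# An 18-month-old with three of six activities reported.
print(stimulation_score(18, {"sang_songs": True, "played": True,
                             "read_books": True}))
```

Dividing by the number of applicable items is what makes scores comparable between the two age groups.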
Responsive care was defined for caregivers of children over 7 months based on interviewer observations of how the caregiver engaged with the child during the interview. This measure had a high proportion of missing data (21% at baseline, 24% at endline) because children were too young, sleeping, or absent during the interview. The four items, totaling up to six points, were helping the child keep busy (0, 1), pointing out objects/naming things (0, 1), recognizing when the child needs help with something (0, 1, 2), and keeping the child in view at all times (0, 1, 2).
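A minimal sketch of how the four observed items add up to six points is given below. The item keys and function name are hypothetical, and returning `None` when the child could not be observed is an assumed way of representing the missing data mentioned above.

```python
# Maximum points per observed item, as described in the text.
# The keys are hypothetical labels.
MAX_POINTS = {
    "helps_keep_busy": 1,
    "names_objects": 1,
    "recognizes_need_for_help": 2,
    "keeps_child_in_view": 2,
}

def responsive_care_score(observed):
    """Sum the interviewer-rated items (0-6 points).

    observed maps each item key to its points, or is None when the child
    could not be observed (assumed representation of missing data).
    """
    if observed is None:
        return None
    total = 0
    for item, max_pts in MAX_POINTS.items():
        pts = observed[item]
        if not 0 <= pts <= max_pts:
            raise ValueError(f"{item} must be between 0 and {max_pts}")
        total += pts
    return total

print(responsive_care_score({
    "helps_keep_busy": 1,
    "names_objects": 0,
    "recognizes_need_for_help": 2,
    "keeps_child_in_view": 1,
}))  # 4
```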
Household and neighborhood environment safety risks were assessed by interviewer observation of the inner and outer household areas, where risks were grouped by community (nearby road, bar/market, ditches); outside compound (open water source, unpenned animals, accessible sharp tools, chemicals or flammable materials, and unprotected cooking area); and inside household (accessible electric, medicine or cleaning chemicals, inappropriate toys). The environment safety outcome score was standardized (0–1) to adjust for the different number of items in each group.
Several variables were explored to describe potential and actual exposure to the intervention. Radio ownership and recent CHW visits were assessed at baseline and endline in both study groups. Radio message content recall and frequency of radio listenership were assessed at endline in both study groups. Exposure to the intervention videos through home and facility visits was assessed in the RMV-ECD arm only at endline, and overall number of CHW visits was assessed in both study groups.
Other predictor and mediator variables included socio-demographic characteristics, history of child illness/injury, health care utilization, parental discipline practices/beliefs, parenting stress, and depressive and anxiety symptoms. Health care utilization and parental discipline measures were adapted from the UNICEF MICS tool. The discipline assessment contained 11 items, each answered “yes” or “no,” divided into four sub-groups: psychological, physical, severe physical, and positive (non-violent) disciplinary practices by anyone in the household in the past month. The eight violent discipline items comprise the “violent discipline” measure, and the three remaining non-violent items comprise the “positive discipline” measure. Discipline scores were standardized to a scale of 0–1.
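The two discipline scores can be sketched as below. The item keys are placeholders rather than the MICS question wording, and dividing the count of "yes" answers by the number of items is an assumption about how the 0–1 standardization was done.

```python
# Placeholder keys: 8 violent items and 3 positive (non-violent) items.
VIOLENT_ITEMS = tuple(f"violent_{i}" for i in range(1, 9))
POSITIVE_ITEMS = tuple(f"positive_{i}" for i in range(1, 4))

def discipline_scores(answers):
    """Return (violent, positive) scores, each on a 0-1 scale.

    answers maps every item key to True ("yes") or False ("no").
    Assumption: each score is the share of items answered "yes".
    """
    violent = sum(answers[i] for i in VIOLENT_ITEMS) / len(VIOLENT_ITEMS)
    positive = sum(answers[i] for i in POSITIVE_ITEMS) / len(POSITIVE_ITEMS)
    return violent, positive

# Illustrative household: two violent practices, all three positive ones.
answers = {item: False for item in VIOLENT_ITEMS + POSITIVE_ITEMS}
answers.update({"violent_1": True, "violent_4": True})
answers.update({item: True for item in POSITIVE_ITEMS})
print(discipline_scores(answers))  # (0.25, 1.0)
```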
The Parenting Stress Index (PSI) scale is a widely used and robust measure of three parenting-related domains: parental distress, parent-child relationship dysfunction (e.g., quality of relationship), and the extent to which the caregiver perceives the child as difficult [38]. The scale is composed of 36 statements (12 per sub-scale domain), which are scored 1 (strongly disagree) to 5 (strongly agree) and can be summed to reflect the total score for each domain. The PSI-36 total score is a composite of the three subscales (range 36–180), with higher scores (or a cut-off of 90) indicating higher parental stress.
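The scoring described above can be illustrated with a short sketch. Two points are assumptions not stated in the text: that the 36 items are ordered in three consecutive blocks of 12 per domain, and that a total at or above 90 (rather than strictly above) counts as high stress. The function and subscale names are hypothetical.

```python
# Assumed layout: three consecutive blocks of 12 items, one per domain.
SUBSCALES = {
    "parental_distress": range(0, 12),
    "relationship_dysfunction": range(12, 24),
    "difficult_child": range(24, 36),
}

def score_psi(responses, cutoff=90):
    """Return (subscale_sums, total, high_stress) for 36 responses.

    Each response must be an integer from 1 (strongly disagree) to
    5 (strongly agree). Assumption: total >= cutoff flags high stress.
    """
    if len(responses) != 36 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected 36 responses scored 1-5")
    subscale_sums = {name: sum(responses[i] for i in idx)
                     for name, idx in SUBSCALES.items()}
    total = sum(subscale_sums.values())
    return subscale_sums, total, total >= cutoff

# The lowest and highest possible totals bracket the 36-180 range.
print(score_psi([1] * 36)[1])  # 36
print(score_psi([5] * 36)[1])  # 180
```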
Depressive symptoms were measured using the 9-item Patient Health Questionnaire, a tool widely used in resource-limited settings and recently validated in a Tanzanian primary care population, showing 78% sensitivity and 87% specificity in detecting depression when compared to a gold-standard psychiatric assessment [39]. The General Anxiety Scale, often used in low- to middle-income country settings [40, 41], measures the seven criteria of anxiety in the Diagnostic and Statistical Manual of Mental Disorders, establishing a provisional anxiety diagnosis and assessing symptom severity [42].
Statistical analysis
We summarized caregiver characteristics at baseline, and measures of covariates and intervention exposure at baseline and endline, using frequencies and percentages for categorical variables and means, medians, interquartile ranges (IQR), and standard deviations (SD) for continuous variables. Participant characteristics and baseline outcome measures in the two intervention arms were compared using chi-square or rank-sum tests. We limited all analyses to those who were followed at endline, after conducting an attrition analysis that showed few statistically significant differences between those excluded and those included (data not shown).
Outcomes associated with the intervention were assessed using McNemar’s test and further described by the proportion of caregivers who improved from having a poor outcome score at baseline to having a good score at endline, with 95% confidence intervals (CI) to allow comparison of interventions by study arm.
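For readers unfamiliar with McNemar's test, the sketch below shows an exact (binomial) version computed from the two discordant cell counts, together with a Wilson 95% interval for a proportion. The text does not say whether the exact or the chi-square form was used, nor which interval method or denominator was applied, so these choices are assumptions. The counts and function names are illustrative, not study data.

```python
from math import comb, sqrt

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs that changed from poor (baseline) to good (endline)
    c: pairs that changed from good (baseline) to poor (endline)
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half

# Illustrative counts: 30 of 100 caregivers improved, 12 worsened.
p_value = mcnemar_exact(30, 12)
low, high = wilson_ci(30, 100)
print(f"p = {p_value:.4f}; improved = 30% (95% CI {low:.1%} to {high:.1%})")
```

Only the discordant pairs enter the test; caregivers whose status did not change between baseline and endline do not affect the p-value.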
Unadjusted and adjusted logistic regression models that accounted for sampling weights and clustering by sampling unit were used to describe the association of the full intervention arm (RMV-ECD), compared with the radio-only (RM) arm, with study outcomes, as well as with other covariables of interest. Baseline status of outcomes of interest and child age were included in adjusted models. All data management and descriptive analyses were done using Stata 16.1; McNemar’s and regression analyses were done in SAS 9.4.
Ethical considerations
The protocol for this evaluation was approved by the National Research Ethics Committee of the Tanzania National Institute of Medical Research and the Advarra Review Board in the United States. Study interviewers obtained written informed consent from caregivers enrolled in the study prior to conducting the baseline interview.