This study was conducted by a multidisciplinary team representing computer engineering/science, clinical sciences, and public health. Therefore, our interpretations cover two distinct areas: lessons about the application of the ML method and implications for maternal health service provision. We discuss both in turn.
This study explored the potential for using ML models to predict the perception of being safe in the workplace among maternal and newborn healthcare providers during the COVID-19 pandemic. Our analysis shows that ML models perform better than conventional statistical methods in terms of accuracy and margins of error. This was the case for all the models across the different experiments, with RF, XGBoost and CatBoost being the most robust. By analysing the confusion matrices of the Logistic Regression model and the RF model from experiment 1A (classification with all features), we notice that (1) ML models (particularly RF in this case) perform better overall than the conventional techniques and are less likely to make predictions that deviate by several classes from the true class, and (2) the likelihood of misclassification increases as we move towards the middle class (i.e. class 3). This is important for the interpretation of the Likert scale output: the feeling of being protected is a subjective perception that results, like any other human perception, from the complex interaction between environmental, genetic, biological, and psychosocial factors, and this complexity is difficult to capture accurately in surveys. Nevertheless, ML models, owing to their architecture and algorithms, capture these interactions more accurately. This helps explain why the number of erroneous predictions is lowest for individuals belonging to class 0 (Not at all protected) and 5 (Completely Protected), and highest for individuals belonging to class 3 (Some Protection): the former are certain about their feeling, while the latter already have a degree of uncertainty.
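The two observations drawn from the confusion matrices can be made concrete with a small calculation. The sketch below uses an invented confusion matrix for five ordered classes, not the study's results, and computes the share of each true class that is misclassified together with the average distance (in classes) between true and predicted class. The function names and all counts are illustrative assumptions.

```python
# Toy confusion matrix for five ordered classes (rows: true, columns: predicted).
# Row/column positions are list indices, not the survey's class labels.
# The counts are invented for illustration and are NOT the study's results.
CONFUSION = [
    [18, 2, 0, 0, 0],
    [3, 12, 5, 0, 0],
    [0, 6, 10, 4, 0],
    [0, 0, 4, 14, 2],
    [0, 0, 0, 1, 19],
]


def per_class_error_rate(matrix):
    """Share of each true class (row) that was assigned to a wrong class."""
    rates = []
    for true_idx, row in enumerate(matrix):
        total = sum(row)
        wrong = total - row[true_idx]
        rates.append(wrong / total if total else 0.0)
    return rates


def mean_absolute_class_deviation(matrix):
    """Average distance, in classes, between true and predicted class."""
    total = 0
    weighted = 0
    for true_idx, row in enumerate(matrix):
        for pred_idx, count in enumerate(row):
            total += count
            weighted += abs(true_idx - pred_idx) * count
    return weighted / total if total else 0.0


for idx, rate in enumerate(per_class_error_rate(CONFUSION)):
    print(f"class position {idx}: error rate {rate:.2f}")
print(f"mean absolute deviation: {mean_absolute_class_deviation(CONFUSION):.2f}")
```

In this toy matrix the error rate peaks at the middle position and is lowest at the two extremes, which is the pattern described above. Comparing the mean absolute deviation of two models is one simple way to check which of them makes the larger class deviations.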
Experiment 1B (classification with selected features) also shows that some ML models (RF, XGBoost and CatBoost) can be trained on a small number of features without losing much accuracy, which is not the case for conventional statistical models. This is particularly important because it allows such a tool to be used to screen for the perception of feeling protected among healthcare providers without collecting a large number of features (that is, with fewer questions).
Experiments 2A (regression with all features) and 2B (regression with selected features), on the other hand, attempt to solve the same problem using regression. These experiments were implemented for several reasons. First, by treating the output as a continuous variable, we can represent the perception of being protected as a spectrum, which is more realistic than discrete categories. Second, this allows us to quantify the exact amount of error at the individual level and so avoid under- or overestimating the model’s performance. For instance, if the classification model predicts 2 instead of 3, we cannot tell how far the model was from making the correct prediction, whereas in the regression model we are able to quantify the error. Third, re-formulating the problem using a different type of ML model helps confirm the validity of the models when the various models give similar results, which was the case in this study. The results of the experiments show that even when the problem is solved using regression, ML models are more robust at making predictions than conventional techniques, with a mean error of 0.5 class.
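The difference between a hard class label and a continuous score can be illustrated with a few invented responses. In the sketch below, a hypothetical classifier and a hypothetical regressor are evaluated on the same five respondents: accuracy only records whether each label was right or wrong, whereas the mean absolute error reports how far each prediction was from the true value. All values and names are assumptions made for illustration, not data or code from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of exact matches between true and predicted classes."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute distance between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented responses on an ordinal scale; NOT data from the study.
y_true = [3, 3, 1, 5, 4]

# Hard class labels from a hypothetical classifier.
class_pred = [2, 3, 1, 5, 3]

# Continuous scores from a hypothetical regressor for the same respondents.
reg_pred = [2.6, 3.1, 1.2, 4.7, 3.4]

# Rounding the regression output recovers a class label when one is needed.
rounded = [round(p) for p in reg_pred]

print("classifier accuracy:", accuracy(y_true, class_pred))
print("regressor MAE:", round(mean_absolute_error(y_true, reg_pred), 2))
print("rounded regressor accuracy:", accuracy(y_true, rounded))
```

For the first respondent the classifier's label of 2 is simply counted as wrong, while the regressor's score of 2.6 shows that its prediction was 0.4 of a class away from the true value of 3.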
By applying the RF algorithm, we were able to extract and rank features by the extent to which they contribute to the prediction of healthcare providers’ feeling of protection in the workplace. The findings were cross-checked by comparing the features’ rankings between the two experiments. The top ten features in experiments 1A (classification with all features) and 2A (regression with all features) were classified into three main themes: (1) information accessibility, clarity and quality; (2) availability of support and means of protection; and (3) COVID-19 epidemiology at the national level. The three themes are discussed below in detail.
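The comparison of feature rankings between two experiments can be sketched as a top-k overlap check. The example below assumes that each experiment yields a mapping from feature name to an importance score (for instance, the importances reported by a fitted random forest). The feature names, scores and helper functions are hypothetical placeholders, not the study's features or results.

```python
def top_k(importances, k):
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


def top_k_overlap(imp_a, imp_b, k):
    """Features appearing in the top k of both rankings, plus their share of k."""
    shared = set(top_k(imp_a, k)) & set(top_k(imp_b, k))
    return sorted(shared), len(shared) / k


# Hypothetical importance scores; names and values are placeholders,
# not the study's features or results.
classification_importance = {
    "knows_what_to_do": 0.21, "concerns_addressed": 0.19, "info_clarity": 0.12,
    "ppe_sufficient": 0.11, "cumulative_cases": 0.09, "gender": 0.01,
}
regression_importance = {
    "concerns_addressed": 0.23, "knows_what_to_do": 0.20, "ppe_sufficient": 0.13,
    "cumulative_cases": 0.10, "info_clarity": 0.04, "gender": 0.02,
}

shared, share = top_k_overlap(classification_importance, regression_importance, k=4)
print("shared top-4 features:", shared)
print("share of top-4 in common:", share)
```

A high share indicates that the two experiments agree on which features matter most, even if the exact order within the top k differs.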
1—Information accessibility, clarity and quality
Features belonging to this theme include healthcare providers’ knowledge of what to do when faced with a COVID-19 maternity case (ranked 1 and 2, respectively, in the two experiments), and healthcare providers’ perception of the information that they received from the facility regarding COVID-19 and maternity care (in terms of value in feeling safe, helpfulness in daily work, and clarity). This suggests that access to information and knowledge, particularly clear information and feasible recommendations, plays a key role in the morale of maternal and newborn healthcare providers. Our results also highlight that the quality of the information received, relative to each healthcare provider’s needs and perceptions, makes an important contribution to healthcare providers’ attitudes and wellbeing. Previous studies, at global and national levels, show that healthcare providers struggled with the lack of knowledge and guidance and with prevailing uncertainty during the early days of the pandemic [15, 17, 30]. Particularly in the case of maternity care, global guidelines and recommendations took some time to be established, and evidence regarding the risk of COVID-19 for women and newborns continues to emerge to this day. This lack of clarity can be stressful for those providing care to women and newborns in these uncertain circumstances, and can translate into perceptions of being unsafe when providing care. On the other hand, some facilities established clear guidelines on referring women with confirmed COVID-19 to other facilities or to COVID-19 treatment centres. This could have contributed to a perception of low exposure to COVID-19 risks among healthcare providers working in the referring facilities, and consequently to a perception of protection in the workplace. Future studies could explore whether differences in perception of protection exist between healthcare providers who work in facilities that refer COVID-19 obstetric cases and those who treat them on site.
2—Availability of support and means of protection
Two main features were grouped to represent the support received from the health facility where healthcare providers work: whether the facility addressed their concerns (ranked 2 and 1, respectively, in the two experiments), and the availability of sufficient PPE (masks and aprons). Healthcare providers are a core building block of the healthcare system, and quality care can only be provided when human resources are empowered and supported. The healthcare system must be responsive and adaptive to the needs of its workforce, and therefore able to address their concerns and worries, both routinely and in times of crisis. Globally, PPE shortage was a significant issue in the early days of the pandemic for all cadres of healthcare providers. Essential healthcare providers such as maternal and newborn care workers who were not caring directly for COVID-19 patients may have experienced this shortage more acutely, as they might not have been prioritised to receive PPE and had to continue providing clinical care. Research showed that this was a source of concern for maternal and newborn healthcare providers, as many of them worried about their own safety and about becoming infected with COVID-19 in the workplace as a result of the lack of PPE [9, 12, 14, 17]. Additionally, the mere availability of PPE is not sufficient, and maternal and newborn healthcare providers must have access to appropriate support and training on PPE use. This includes training on adequate donning and doffing, as well as learning to provide empathic care while wearing it [14, 47]. In our survey, these questions were specific to support received from the health facility where respondents worked. Nonetheless, it is worth mentioning that the support that health facilities can provide is conditional upon the support and resources that facilities receive from higher structures in the healthcare system, nationally and globally.
For example, facilities cannot ensure PPE availability to care providers if there is a national and global shortage. Additionally, facilities cannot communicate guidelines and information to frontline care providers unless those have been officially issued by health authorities. Therefore, the interpretation of these features as a responsibility of health facilities should be made with caution, as we consider the responsiveness of health facilities to be a mere reflection of the responsiveness of the healthcare system.
3—COVID-19 epidemiology at the national level
Features grouped under this theme represent the level of spread of the COVID-19 outbreak at the country level, including the cumulative number of COVID-19 cases and deaths due to COVID-19 and the daily number of cases reported on the day of data collection. Our results show that the extent of the transmission of the virus contributes to the prediction of healthcare providers’ perception of protection in the workplace. Healthcare providers, much like the rest of the community, are sensitive to these kinds of changes at the national level, and this is reflected in their attitudes in the workplace. The higher the number of COVID-19 cases and deaths in the community, the higher the likelihood of having to provide care to women with confirmed or suspected COVID-19. This influences the level of risk perceived by healthcare providers and their perceptions of being protected in the workplace. These values are publicly available at the national level, making the prediction of the output at the individual level easier to achieve.
Least contributing factors
Further analysis reveals that restriction measures applied at the national level are among the least contributing factors to the prediction of the outcome. In a previous analysis using qualitative data from the same survey conducted at a time point further into the pandemic, we identified that maternal and newborn healthcare providers’ perception of being safe was linked to the extent of the COVID-19 restrictions applied at the country level. However, the results from the current quantitative analysis contradict our qualitative findings. This shows that ML analysis, although it can be valuable in informing a rapid response, can be supplemented by qualitative data in order to provide a clearer, more in-depth assessment of the wellbeing of healthcare providers in emergency situations. The country-income group also made a minimal contribution to predicting healthcare providers’ feeling of safety. This highlights the need to consider healthcare providers’ wellbeing in various contexts, particularly considering the gap in research conducted in low- and middle-income countries on this issue. Some facility-level characteristics, such as the reception of referrals or the presence of an intensive care unit, were also among the least contributing factors to the outcome. Although higher-level facilities have been given the responsibility to handle COVID-19 cases in many countries, healthcare providers in lower-level facilities have had similar perceptions of safety to those working in higher-level facilities. The gender of the healthcare providers was also a minimally contributing factor to the perception of safety. This finding may warrant further exploration in future studies designed to unpack gendered differences in the impact of the pandemic on maternal and newborn healthcare providers, the majority of whom are women.
Strengths and limitations
This is one of the first studies that uses ML to develop an algorithm that predicts maternal and newborn healthcare providers’ feeling of protection in the workplace during the early phases of the COVID-19 pandemic, using data collected through an online survey. This work is one of the few applications of ML models to subjective survey data, and despite the large number of limitations and assumptions associated with analysing “perceptions and opinions” quantitatively, the results are promising and the method has a relatively high level of accuracy (81%).
Nonetheless, when ML is applied in public health research, the results must not be taken at face value and must be interpreted with caution. To ensure the relevance of our findings beyond numbers, and to confirm the validity of the applied methods, we adopted two approaches: (1) a cross-comparison of the features identified in two experiments, which shows that most features appear in the top 10 across the two experiments (convergent validity); and (2) a thorough qualitative interpretation of the top-ranked features contributing to the prediction of the output in light of pre-existing literature and knowledge, which supported the conceptual validity of the tool. This process highlights the importance of the multidisciplinary collaboration between computer engineering/science and public health, which enhanced the value of the work and validated the findings from different perspectives.
One possible limitation of our work is that additional features that could have contributed to the prediction of the output were absent from the analysis. This includes information that was not initially collected in the survey, such as personal features (e.g. age, years of experience, experience with previous outbreaks and disruptive events) and individual risk factors for COVID-19. Other information was collected in the survey but in an open-ended manner, and was therefore not included in this analysis, such as being re-assigned to COVID-19 treatment wards, being diagnosed with or suspected of having COVID-19, having colleagues diagnosed with COVID-19, or the number of deaths due to COVID-19 among healthcare providers at the country level. Future applications of this tool should consider expanding the list of features, including an additional feature on the availability of COVID-19 vaccines to healthcare providers.
The study’s sampling technique and online data collection meant that the data are not representative of the healthcare provider population, and we acknowledge the potential for selection bias given that there was no sampling frame for the global study participants. Additionally, many respondents in the original sample were excluded from the final analysis because they had incomplete fields or missing information, which could have introduced further selection bias. Information bias could also exist in the data, particularly related to the quality of reporting of national estimates of the number of COVID-19 cases and deaths.
The scope of our work, including the survey and the research area, is limited to maternal and newborn healthcare providers. There is potential to evaluate such advanced methods in research related to other cadres of healthcare providers, including those who are at the frontline of providing care to COVID-19 patients.
This study provides factors that predict the perception of safety among a global sample of healthcare providers who work in different settings. It was not possible to assess context-specific factors that could predict the outcome differentially based on the country setting or income-group because of the small size of the sub-samples. Future developments of ML models at the country-level can unpack context-specific factors that can be addressed at the local level, particularly for low- and middle-income countries.
Finally, it is important to mention that we do not underestimate the utility and importance of conventional techniques, but rather embrace both techniques and take advantage of their strengths based on the problem to be solved. For instance, for some problems with small datasets, conventional techniques offer a fast and cost-effective solution, whereas for complicated problems with large datasets and nonlinear interaction between different variables, machine learning algorithms might offer a better alternative.