Data sources and variables
We extracted data for facility-level workload (health service and other activities) calculations from the Access, Bottlenecks, Costs, and Equity (ABCE) project surveys that collected data for the fiscal years 2008–09 to 2012–13 in Madhya Pradesh (MP) [21], 2010–11 to 2014–15 in Gujarat (GJ) [22], 2009–10 to 2013–14 in Odisha (OD) [23], 2007–08 to 2011–12 in Tamil Nadu (TN) [24], and 2007–08 to 2011–12 in Andhra Pradesh and Telangana (AP&TG) [25] at rural PHCs and CHCs. The ABCE project used stratified random sampling to create nationally representative facility data sets. Facilities in rural and semi/peri-urban localities from the survey were taken as rural. We focused on 8 centre-cadre combinations in rural areas: PHC-nurses, PHC-doctors, CHC-nurses, CHC-GDMOs, CHC-physicians, CHC-surgeons, CHC-OBGYNs, and CHC-paediatricians. These cadres have specific activities that they perform at PHCs and CHCs according to IPHS (Additional file 2). Cadre-specific workload components extracted from ABCE were segregated into health service activities (HSA) (e.g., outpatient visits, inpatient admissions, surgeries, and deliveries) and support and additional activities (e.g., patient review meetings, outreach services, and administrative meetings) performed by all or select staff members (Additional file 3A–C). The patient numbers depicted the total annual workload of a particular service provided at a healthcare centre. We also extracted facility-level loads for support and additional services (Additional file 3A–C).
For activity standards (the time required to perform an activity), we referred to previously conducted WISN studies in India (see Additional file 1 for study details and Additional file 3A–C for variables), followed by the WHO–WISN Methods Guide [26]. Activity standards were collected for the HSA included for doctor and nurse cadres at PHCs and CHCs. The standards were converted to common units (Additional file 3A–C).
To project WISN estimates at the state (i.e., states and union territories) and national levels, we used cadre-wise data on ‘in-position’ posts (actual staff present) and ‘sanctioned’ posts (positions authorized or approved under NRHM based on IPHS norms) from RHS 2019 [27]. The numbers of functional rural PHCs and CHCs were also extracted. States with missing or incomplete data were excluded from the analysis (Additional file 4).
WISN calculations for individual health centre facilities
We calculated annual available working time (AWT) in hours for each cadre according to
$${\text{AWT}} = \left[ {A - \left( {B + C + D + E} \right)} \right] \times F$$
(1)
where A is the number of working days in a year; B, C, D and E are the numbers of days of annual leave, sick leave, public holidays and other leave, respectively; and F is the number of working hours per day. Values for leaves were taken from an existing Indian WISN study [10].
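As a minimal illustration of Eq. (1), the arithmetic can be sketched in Python as follows. This is not the study's analysis code (which was written in R); the function name and all input values are hypothetical and are not the leave values taken from [10].

```python
def available_working_time(working_days, annual_leave, sick_leave,
                           public_holidays, other_leave, hours_per_day):
    """Eq. (1): AWT = [A - (B + C + D + E)] x F, returned in hours per year."""
    days_absent = annual_leave + sick_leave + public_holidays + other_leave
    return (working_days - days_absent) * hours_per_day


# Hypothetical inputs for illustration only (not the values used in the study).
awt_hours = available_working_time(
    working_days=312,    # A: e.g., 6 working days x 52 weeks
    annual_leave=30,     # B
    sick_leave=10,       # C
    public_holidays=15,  # D
    other_leave=5,       # E
    hours_per_day=8,     # F
)
print(awt_hours)  # (312 - 60) * 8 = 2016 hours
```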
Standard workload represents the possible volume of HSA conducted by a health worker in a year. It was calculated by dividing AWT by the respective service activity standards. The annual workload was the actual number of patients seeking care under respective health services in that year. The required number of health workers for HSA was obtained by adding the ratios of annual workload to the standard workload for each health service.
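The HSA calculation described above can be sketched as follows. This Python snippet is illustrative only and is not taken from the study's R scripts; the function name, the service names, the workloads, and the per-patient times are hypothetical, and activity standards are assumed here to be expressed in minutes per patient.

```python
def hsa_staff_requirement(awt_hours, services):
    """Sum over services of (annual workload / standard workload).

    `services` maps a service name to (annual_workload, minutes_per_patient).
    Standard workload = AWT divided by the activity standard (in hours).
    """
    required = 0.0
    for annual_workload, minutes_per_patient in services.values():
        activity_standard_hours = minutes_per_patient / 60.0
        standard_workload = awt_hours / activity_standard_hours
        required += annual_workload / standard_workload
    return required


# Hypothetical facility: 2000 h of AWT and two health services.
services = {
    "outpatient_visits": (12000, 10),  # 12,000 visits at 10 min each
    "deliveries": (500, 120),          # 500 deliveries at 120 min each
}
hsa_required = hsa_staff_requirement(2000, services)
print(round(hsa_required, 3))  # 1.0 + 0.5 = 1.5 health workers
```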
Category allowance standard (CAS), expressed as the percentage of AWT spent, represents the activity standard for a given support activity performed by all staff members of a cadre. We used facility-reported actual working times and time standards from other sources (Additional file 3B). Total CAS percentage was the sum of individual CAS. Category allowance factor (CAF) is the multiplier that gives the required number of staff for health service and support activities. It was calculated as
$${\text{CAF}} = \frac{1}{{1 - \frac{{\text{Total CAS percentage}}}{100}}}$$
(2)
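A short Python sketch of Eq. (2) is given below for illustration; it is not the study's code, and the function name and the CAS percentages are hypothetical.

```python
def category_allowance_factor(cas_percentages):
    """Eq. (2): CAF = 1 / (1 - total CAS percentage / 100)."""
    total_cas = sum(cas_percentages)
    if not 0 <= total_cas < 100:
        raise ValueError("total CAS percentage must lie in [0, 100)")
    return 1.0 / (1.0 - total_cas / 100.0)


# Hypothetical support activities taking 10%, 5% and 5% of AWT.
caf = category_allowance_factor([10, 5, 5])
print(round(caf, 3))  # 1 / (1 - 0.20) = 1.25
```

A CAF of 1.25 in this example means that the staff required for health service activities is scaled up by 25% to cover support activities.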
Individual allowance standard (IAS) represents the activity standard for a given additional activity of select staff members. IAS was the product of the time required to perform given additional activity and the number of staff members involved in the activity. We used the maximum value of actual working times reported among facilities. Total IAS was the sum of individual IAS. Individual allowance factor (IAF) is the staff required to cover additional activities and was calculated as
$${\text{IAF = }}\frac{{\text{Total IAS}}}{{{\text{AWT}}}}$$
(3)
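Eq. (3) can be sketched in Python as follows. The snippet is illustrative and not part of the study's scripts; the function name, the activity times, and the staff counts are hypothetical, and times are assumed to be annual hours so that they match the units of AWT.

```python
def individual_allowance_factor(additional_activities, awt_hours):
    """Eq. (3): IAF = total IAS / AWT.

    `additional_activities` is a list of (annual_hours_per_staff, n_staff);
    each individual IAS is the product of the two.
    """
    total_ias = sum(hours * n_staff for hours, n_staff in additional_activities)
    return total_ias / awt_hours


# Hypothetical: two staff spend 100 h/year each on outreach,
# and one staff member spends 50 h/year on administrative meetings.
iaf = individual_allowance_factor([(100, 2), (50, 1)], awt_hours=2000)
print(iaf)  # 250 / 2000 = 0.125
```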
The WISN-based required number of staff of an HRH cadre at a health centre facility was calculated as
$${\text{WISN = }}\left( {{\text{HSA }} \times {\text{ CAF}}} \right){\text{ + IAF }}$$
(4)
The raw values for facility-specific WISN-based requirements for cadres were rounded to integers as per the WISN user’s manual [28].
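Eq. (4) combines the three components, as in the minimal Python sketch below. The function name and input values are hypothetical. The rounding rule of the WISN user's manual [28] is not reproduced in this section, so the sketch deliberately returns only the raw (unrounded) value.

```python
def wisn_requirement(hsa_staff, caf, iaf):
    """Eq. (4): raw WISN = (HSA x CAF) + IAF.

    Rounding to an integer per the WISN user's manual is intentionally
    left out here because its rule is not stated in this section.
    """
    return hsa_staff * caf + iaf


# Hypothetical component values for one cadre at one facility.
raw_wisn = wisn_requirement(hsa_staff=1.5, caf=1.25, iaf=0.125)
print(raw_wisn)  # 1.5 * 1.25 + 0.125 = 2.0
```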
We excluded facilities that resulted in null values (WISN = 0). Given that IAF forms a significant proportion of nurses’ workload, data points with null values for this component were excluded for nurses at PHCs and CHCs. We assumed a standard workweek of 48 h (8 h × 6 days) and considered that some facilities might operate on a partial basis. Facilities averaging < 24 working hours per week, i.e., less than half the standard workweek, were excluded. Facility-wise WISN values were thus calculated for the 8 centre-cadre combinations mentioned above.
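The three exclusion rules can be expressed as a simple filter, sketched below in Python. The record layout, field names, and example records are hypothetical and do not reflect the structure of the ABCE data or the study's R scripts; whether the WISN = 0 check was applied to raw or rounded values is not stated, so the sketch simply tests the stored value.

```python
STANDARD_WORKWEEK_HOURS = 48  # 8 h x 6 days, as assumed in the text


def keep_facility(record):
    """Return True if a facility-cadre record passes all exclusion rules."""
    if record["wisn"] == 0:                                # null WISN value
        return False
    if record["cadre"] == "nurse" and record["iaf"] == 0:  # null IAF, nurses only
        return False
    if record["weekly_hours"] < STANDARD_WORKWEEK_HOURS / 2:  # < 24 h per week
        return False
    return True


# Hypothetical records for illustration.
records = [
    {"id": "A", "cadre": "doctor", "wisn": 2, "iaf": 0.1, "weekly_hours": 48},
    {"id": "B", "cadre": "nurse", "wisn": 3, "iaf": 0.0, "weekly_hours": 48},
    {"id": "C", "cadre": "doctor", "wisn": 0, "iaf": 0.0, "weekly_hours": 48},
    {"id": "D", "cadre": "nurse", "wisn": 1, "iaf": 0.2, "weekly_hours": 20},
    {"id": "E", "cadre": "nurse", "wisn": 2, "iaf": 0.3, "weekly_hours": 36},
]
kept = [r for r in records if keep_facility(r)]
print([r["id"] for r in kept])  # ['A', 'E']
```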
Modelling nationally representative average for WISN-based requirements
To explore data heterogeneity, facility-specific raw (unrounded) WISN values for all cadres were assessed for across-state differences using non-parametric Kruskal–Wallis one-way ANOVA (analysis of variance). We used raw values for better ANOVA model fit, as the count data generated by the WISN rounding scheme created saturation issues. Non-parametric tests were chosen due to observed skewness in the data. To create WISN-based cadre requirement values that could be suitably used for national-level planning, we used generalised estimating equations (GEE) [29]. GEE estimates population-averaged responses and is robust to mis-specification of the covariance structure. Since we used data collected over several years from facilities clustered within states to create nationally relevant WISN-based requirement thresholds, we used GEE to control for the effects of these variables, i.e., to obtain estimates averaged over states and years. Here, the log-link Poisson model permitted the use of rounded WISN values as a count outcome, with state and year as categorical predictors. For each centre-cadre combination (e.g., PHC-doctors), three models with different working error correlation structures (independence, exchangeable, and auto-regressive order-1) were run. The model with the lowest quasi information criterion (QIC) value was chosen to represent the data best. Predicted marginal means and 95% asymptotic confidence intervals for the best-fit model gave WISN-based requirement values to represent average per-centre estimates for India, accounting for the influence of individual states and years:
$${\text{log}}\left( {{\text{WISN}}_{ij} } \right) = \beta_{0} + \beta_{1} {\text{State}} + \beta_{2} {\text{Year }}$$
(5)
where i = health-centre facility ID, j = measurement instance.
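Because state and year enter Eq. (5) as categorical predictors, the terms β1 State and β2 Year can be read as shorthand for one coefficient per non-reference level. A sketch of that expanded form, under this assumption, is given below; the symbols μ, s(i), t(i, j), β with a state subscript, and γ are introduced here only for illustration, and the reference levels are not specified in the text.

$$\log \left( {\mu_{ij} } \right) = \beta_{0} + \beta_{s\left( i \right)} + \gamma_{t\left( {i,j} \right)} ,\quad \mu_{ij} = {\text{E}}\left[ {{\text{WISN}}_{ij} } \right]$$

where s(i) is the state of facility i, t(i, j) is the fiscal year of measurement j at facility i, and the coefficients of the reference state and reference year are fixed at zero. Under this reading, the exponentiated linear predictor gives the modelled mean requirement for a given state and year.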
National and state-level WISN projections
WISN ratios, per-centre and overall WISN differences were calculated for states and all India as follows:
$${\text{WISN ratio}} = \frac{P}{{{\text{WISN}} \times N}}$$
(6)
$${\text{WISN difference}} \left( {\text{per - centre}} \right) = \left( \frac{P}{N} \right) - {\text{WISN}}$$
(7)
$${\text{WISN difference}} \left( {{\text{overall}}} \right) = P - \left( {{\text{WISN}} \times N} \right)$$
(8)
where ‘WISN’ stands for the nationally representative modelled average WISN-based requirement threshold for a centre-cadre combination, ‘P’ stands for the actual total number of staff of the cadre present at the given centre (PHC and CHC) at state and national levels, and ‘N’ represents the number of functional centres of the type (PHCs and CHCs) at the state and national levels from RHS-2019. The interpretation of the values was as per the WISN user’s manual [28]. WISN difference depicted workforce problem, categorised as balance, surplus and shortage based on values = 0, > 0, and < 0, respectively. WISN ratio implied workload pressure, with values = 1 and > 1 indicating normal pressure and no pressure, respectively. For ratios < 1, we created arbitrary categories for WISN ratio for the current study as follows:
0–0.25 = very high, 0.25–0.50 = high, 0.50–0.75 = medium, and 0.75–1 = low. The WISN ratios were thus categorized into 6 groups (0–0.25, 0.25–0.50, 0.50–0.75, 0.75–1, 1, > 1) and interpreted together with WISN differences to determine the workload pressure.
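Eqs. (6) to (8) and the ratio categories can be sketched in Python as follows. The function names and input numbers are hypothetical. The text does not say which band a ratio lying exactly on a boundary (e.g., 0.25) belongs to; the sketch assumes the higher-pressure band, which is an assumption of this illustration only.

```python
def wisn_projection(staff_in_position, wisn_per_centre, n_centres):
    """Eqs. (6)-(8) for one cadre at one centre type in one state (or India)."""
    required = wisn_per_centre * n_centres
    return {
        "ratio": staff_in_position / required,                                # Eq. (6)
        "difference_per_centre": staff_in_position / n_centres - wisn_per_centre,  # Eq. (7)
        "difference_overall": staff_in_position - required,                   # Eq. (8)
    }


def workload_pressure(ratio):
    """Map a WISN ratio to the pressure categories used in the text.

    Boundary values are assigned to the higher-pressure band (assumption).
    """
    if ratio > 1:
        return "none"
    if ratio == 1:
        return "normal"
    if ratio > 0.75:
        return "low"
    if ratio > 0.50:
        return "medium"
    if ratio > 0.25:
        return "high"
    return "very high"


# Hypothetical state: 240 staff in position, modelled WISN of 2 per centre, 200 centres.
result = wisn_projection(staff_in_position=240, wisn_per_centre=2, n_centres=200)
pressure = workload_pressure(result["ratio"])
print(result["ratio"], result["difference_overall"], pressure)
```

In this example the ratio is 0.6 and the overall difference is a shortage of 160 staff, which falls in the medium-pressure band.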
To assess the association of workload pressure across states for HRH cadres at a given centre type, we calculated non-parametric Spearman’s rank correlations (ρ). We chose Spearman’s correlations as they are robust to violations of linearity and normality assumptions and to biases due to outliers and small samples [30]. For PHCs, a bivariate correlation was calculated between doctors and nurses across states. For CHCs, we calculated partial correlations among the 6 HRH cadres to determine workload pressure co-occurrence between specific cadre pairs while controlling for other interactions.
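For the bivariate case, Spearman's ρ is the Pearson correlation of the ranks, which can be sketched in plain Python as below. This illustration is not the study's R code, the per-state ratios are hypothetical, and the partial correlations used for CHCs are not covered by this sketch.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical WISN ratios for PHC doctors and nurses in five states.
doctor_ratios = [0.2, 0.5, 0.9, 1.1, 0.7]
nurse_ratios = [0.3, 0.4, 1.2, 0.8, 0.6]
rho = spearman_rho(doctor_ratios, nurse_ratios)
print(round(rho, 3))  # 0.9
```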
Comparison of WISN-based requirement with current sanctioning
Two analyses were conducted to investigate suboptimal sanctioning. First, we calculated:
$${\text{Sanctioning difference}} \left( {\text{per - centre}} \right) = \left( \frac{S}{N} \right) - {\text{WISN}}$$
(9)
$${\text{Sanctioning difference }}\left( {{\text{overall}}} \right) = S - \left( {{\text{WISN}} \times N} \right)$$
(10)
where ‘WISN’ and ‘N’ stand for values as described above, while ‘S’ stands for the total number of sanctioned posts of a cadre at the given centre type (PHC and CHC) at the state and national levels from RHS-2019. Sanctioning differences depict HRH misallocation, with values > 0 indicating over-sanctioning, < 0 indicating under-sanctioning, and = 0 indicating optimal sanctioning. Second, we checked the concordance (i.e., agreement) between the sanctioned posts under the current norm (S) and WISN-based requirements (WISN × N, as given above) across states using Lin’s concordance correlation coefficient (RC) [31]. Coefficient values of −1, 0, and +1 depict perfect disagreement, no agreement, and perfect agreement, respectively. Values < 0.90 depict poor agreement [32]. We also calculated the bias correction factor that measures the deviation from the 45° line (perfect concordance), with 1 showing no deviation.
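The overall sanctioning difference of Eq. (10) and Lin's concordance correlation coefficient can be sketched in Python as below. The coefficient is computed with the standard formula 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²) using 1/n moments, and the bias correction factor as the ratio of the concordance coefficient to the Pearson correlation. The function names and the per-state numbers are hypothetical, and the study's own R implementation may differ in detail.

```python
import math


def sanctioning_difference_overall(sanctioned, wisn_per_centre, n_centres):
    """Eq. (10): S - (WISN x N); negative values indicate under-sanctioning."""
    return sanctioned - wisn_per_centre * n_centres


def lins_ccc(x, y):
    """Return (concordance correlation coefficient, bias correction factor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    ccc = 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
    pearson_r = cov_xy / math.sqrt(var_x * var_y)
    return ccc, ccc / pearson_r


# Hypothetical sanctioned posts (S) and WISN-based requirements (WISN x N) in three states.
sanctioned = [100, 200, 300]
required = [200, 300, 400]
ccc, bias_correction = lins_ccc(sanctioned, required)
print(round(ccc, 4), round(bias_correction, 4))  # 0.5714 0.5714
```

In this example the two series are perfectly correlated but uniformly shifted by 100 posts, so the low coefficient comes entirely from the bias correction factor, i.e., the offset from the 45° line.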
General statistical and package details
Statistical significance for hypothesis tests (ANOVA and correlations) was set at the conventional threshold of 0.05, i.e., p values < 0.05 were considered significant. Analyses were conducted in open-source R (Version 4.0.2) [33] and RStudio (Version 1.3.1056) (https://rstudio.com/) using validated packages [34,35,36,37,38,39,40,41,42]. We provide the analysis scripts (Additional file 5A–D), generated data (Additional file 6A–B) and data dictionary for RHS-based calculations (Additional file 6C). These files can also be viewed on https://github.com/asarforindia/RHS-WISN.