An econometric model for hourly wage is developed, and explanatory variables customary to studies of this nature are included in the log-lin model.
Hourly wage is determined in Equation (1) as:
\[
\ln W_i = \alpha + \gamma\,\mathrm{Gender}_i + \delta\,UO_i + \sum_{j} \beta_j X_{ij} + \varepsilon_i \tag{1}
\]
where $\ln W_i$ is the log of hourly wage, Gender is a dummy variable with the value 1 for male, $UO_i$ is unpaid overtime and $X_{ij}$ is a vector of j variables known to affect wage differentials, including level of education, occupation, experience, experience squared, marital status, number of children, number supervised, and private/public sector.
Initially, ordinary least-squares regression is used to identify the gender wage gap when unpaid overtime is included in, and then excluded from, the model. As displayed in Equation (1), the log of hourly wage is regressed on several variables capturing individual and work characteristics. The gender dummy variable measures the gender differential. Later we stratify the sample by gender to compare the impact of unpaid overtime on male and female incomes. Additional analysis estimates the hourly wage separately for each occupational group.
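The include-then-exclude comparison described above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's estimation code: the variable names, the single `educ` control standing in for the full vector of controls, and every coefficient used to simulate the data are assumptions chosen only to make the example run.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for y = X @ beta + error (X includes a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic data for illustration only; none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 2000
gender = rng.integers(0, 2, n)             # 1 = male, 0 = female
educ = rng.normal(14, 2, n)                # hypothetical stand-in for the control variables
uo = rng.normal(2 + 1.5 * gender, 3, n)    # unpaid overtime in hours per week (assumed values)
log_wage = 2.5 + 0.10 * gender - 0.01 * uo + 0.05 * educ + rng.normal(0, 0.2, n)

const = np.ones(n)
X_with = np.column_stack([const, gender, uo, educ])   # model including unpaid overtime
X_without = np.column_stack([const, gender, educ])    # model excluding unpaid overtime

# The coefficient on the gender dummy (column 1) is the estimated gender differential.
gap_with = ols(X_with, log_wage)[1]
gap_without = ols(X_without, log_wage)[1]
print(f"gender coefficient with UO:    {gap_with:.4f}")
print(f"gender coefficient without UO: {gap_without:.4f}")
```

Comparing the two gender coefficients shows how much of the estimated differential shifts when unpaid overtime is omitted. The direction and size of the shift in this sketch follow entirely from the simulated values.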
Lastly, the Blinder–Oaxaca decomposition [21, 22] is employed to decompose the gap in outcomes between males and females. The gender pay gap in this log-lin method arises from two functions. The first is the actual difference in variables as shown by the difference in mean values between males and females – a part that is explained by group differences in productivity characteristics such as education, work experience and unpaid overtime. The second function is the discrimination function or unexplained residual – a part that cannot be accounted for by differences in characteristics [28]. This unexplained component is traditionally interpreted as a measure of discrimination. The expectation is that similar endowments should translate into similar wage levels among females and males. Any differences in male and female earnings from this function arise from the difference in male and female coefficients [29].
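The gender wage gap is decomposed (that is, the Blinder–Oaxaca decomposition) using the following equation:
\[
\overline{\ln(W)}^{M} - \overline{\ln(W)}^{F} = \left(x_{ij}^{M} - x_{ij}^{F}\right)\beta_{ij}^{M} + x_{ij}^{F}\left(\beta_{ij}^{M} - \beta_{ij}^{F}\right) \tag{2}
\]
where $\overline{\ln(W)}$ is the mean of the log of hourly wages for males (M) and females (F), $x_{ij}^{M}$ is the vector of means from the male equation, $x_{ij}^{F}$ is the vector of means for the female equation, $\beta_{ij}^{M}$ is the vector of coefficients from the male equation, and $\beta_{ij}^{F}$ is the vector of coefficients from the female equation.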
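A two-part decomposition of this kind can be sketched in a few lines. The sketch assumes the common two-fold form in which the male coefficients serve as the reference weights; whether the study used this weighting is an assumption. The data are synthetic, and the function name `oaxaca_twofold` is hypothetical.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients; X must contain a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def oaxaca_twofold(X_m, y_m, X_f, y_f):
    """Two-fold decomposition with male coefficients as reference (assumed weighting)."""
    beta_m, beta_f = ols(X_m, y_m), ols(X_f, y_f)
    xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)
    explained = (xbar_m - xbar_f) @ beta_m      # differences in mean characteristics
    unexplained = xbar_f @ (beta_m - beta_f)    # differences in coefficients
    return explained, unexplained

# Synthetic data for illustration only; values are invented.
rng = np.random.default_rng(1)

def simulate(n, educ_mean, intercept, slope):
    educ = rng.normal(educ_mean, 2, n)
    X = np.column_stack([np.ones(n), educ])
    y = intercept + slope * educ + rng.normal(0, 0.2, n)
    return X, y

X_m, y_m = simulate(1500, 14.5, 2.6, 0.05)
X_f, y_f = simulate(1500, 14.0, 2.5, 0.05)

explained, unexplained = oaxaca_twofold(X_m, y_m, X_f, y_f)
raw_gap = y_m.mean() - y_f.mean()
print(f"raw gap {raw_gap:.4f} = explained {explained:.4f} + unexplained {unexplained:.4f}")
```

Because each group regression includes a constant, the fitted value at the group means equals the group mean of the outcome, so the two parts sum to the raw gap in mean log wages up to floating-point error.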
A cross-sectional approach is used to investigate gender–wage differentials. Although cross-sectional analysis ignores the effects of institutional and technological change and changes in the labour market over time [30], it does permit the inclusion of certain variables, such as human capital, and allows an examination of the wage distribution across individuals [31]. Furthermore, the persistent nature of the gender wage gap suggests that cross-sectional analysis is not as biased here as it would be in other applications, since the wage gap shows little variation over time [32].
The sample
The data presented are subgroups of a larger study, the Work Outcomes Research Cost-benefit Project [33]. The information in this paper was collected from the Health and Performance at Work Questionnaire (HPQ) developed by the World Health Organization. Information about the HPQ can be accessed online [34]. Employees over the age of 18 years were invited to respond to the HPQ. Participation in the survey was voluntary and confidential. The University of Queensland Human Research Ethics Committee approved the study protocol. The survey, drawn from the Australian workforce during 2005 and 2006, had a response rate of 25%.
The data used for the analysis were confined to Queensland employees working in the health sector, aged 25 to 64 years. In this paper the health sector includes those employed in the community sector who assist health professionals in the provision of patient care. This study follows the practice of most Australian studies of gender wage differentials by focusing on full-time workers [1, 8–11, 24]. Confining the analysis to workers in the health sector captured the award agreements of the state of Queensland and the industry. Restricting the sample to one industry in one Australian state also reduced the complexities associated with heterogeneous institutional factors and labour market forces experienced among various industries and Australian states.
Those aged 65 and over were excluded from the analysis because the minimum pension age for males is 65 years in Australia. Persons under 25 years old were also excluded because many had not yet completed their tertiary studies, and their inclusion would add greater uncertainty and heterogeneity to the sample. That is, no information was collected on the level and type of studies undertaken – if any – by this group.
After excluding those who did not fit this study’s criteria, 10,066 observations remained for analysis. A comparison of the Work Outcomes Research Cost-benefit dataset with the Australian Bureau of Statistics census data of 2005 [35] showed similar demographics for employees aged 25 to 64. Discrepancies in the percentage of full-time employees in the occupation categories between the two surveys are probably due to the slight differences in the categorization of workers.[a]
There are two potential sources of selection bias in the chosen sample. First, there is the decision to enter the paid labour market. Wages are only observed for people who are participating in the labour force and this might be a selective group. Second, there is the decision to work full-time or part-time conditional upon labour market entry. Several studies report that corrections for selection bias in the analysis of gender wage differentials, such as the Heckman procedure, have produced conflicting results and that applying this methodology may indeed introduce more bias (for a review of the literature, see [9]). Like Eastough and Miller [9], this study does not correct for selectivity bias, as doing so would require a selection mechanism more complex than the single selection mechanism.
The data
The hourly wage rate was constructed by dividing annual income by 52 weeks and then by the hours employees were expected to work in a typical 7-day period. The variable was then converted to its logarithm. This produced a level of skewness and kurtosis within the acceptable range of a normal distribution.
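The construction of the wage variable can be written out as a short sketch. The function and argument names are hypothetical, and the example income and hours are invented for illustration.

```python
import math

WEEKS_PER_YEAR = 52

def hourly_wage(annual_income, expected_weekly_hours):
    """Annual income spread over 52 weeks of expected weekly hours."""
    if expected_weekly_hours <= 0:
        raise ValueError("expected weekly hours must be positive")
    return annual_income / (expected_weekly_hours * WEEKS_PER_YEAR)

def log_hourly_wage(annual_income, expected_weekly_hours):
    """Natural log of the hourly wage, the dependent variable of a log-lin model."""
    return math.log(hourly_wage(annual_income, expected_weekly_hours))

# Invented example: 52,000 per year at 40 expected hours per week.
wage = hourly_wage(52_000, 40)
log_wage = log_hourly_wage(52_000, 40)
print(wage, round(log_wage, 4))
```

Note that the denominator uses expected rather than actual hours, so hours worked beyond the employer's expectation do not lower the computed hourly rate.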
The construction of the unpaid overtime variable involved several steps. The HPQ survey asked employees: ‘About how many hours altogether did you work in the past seven days?’. This information gave the actual hours worked over the week. Respondents were also asked: ‘How many hours does your employer expect you to work in a typical seven-day week? (If it varies, estimate the average.)’. This information determined the expected hours worked.
Unpaid overtime was calculated as actual minus expected hours worked per week.[b] A positive (negative) sign indicated that the employee worked hours above (below) their employer’s expectation. If an employee was expected by their employer to perform a certain amount of overtime, then actual and expected hours would equal each other.
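As a small worked illustration of this calculation, the sketch below computes the measure for three invented respondents. The function name and the hours are hypothetical.

```python
def unpaid_overtime(actual_hours, expected_hours):
    """Actual minus expected weekly hours; positive means work above expectation."""
    return actual_hours - expected_hours

# Invented respondents: (actual hours in past seven days, employer-expected weekly hours)
respondents = [(45, 38), (36, 38), (42, 42)]
overtime = [unpaid_overtime(actual, expected) for actual, expected in respondents]
print(overtime)
```

The third respondent illustrates the case where expected overtime is already built into the employer's expectation, so the measure is zero.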
This indirect measure of unpaid overtime does have an advantage over directly asking employees to report the amount. Some employees may claim to work unpaid overtime even though their contracts do not specify the length of working hours. This is typical of managerial and professional occupations. In these cases it is more difficult to ascertain how and why it is believed that work has been undertaken for no pay [36]. The indirect estimate of unpaid overtime is useful when investigating a range of occupational groups. Hours worked beyond the expected level probably do not attract pay, even though there is no direct evidence of this. In contrast, supposing that an employee works ‘x’ hours of overtime and their employer expects them to work ‘x’ hours of overtime, then one would expect this to be reflected in the respondent’s annual income.
The data did not include information on actual labour market experience. In the absence of such information, the traditional approach is to use the Mincer proxy for potential labour market experience (that is, the experience proxy, PE), calculated as age minus number of years of education minus 6.[c] The derivation of this variable required a number of intermediate steps. To calculate the number of years of full-time equivalent education, it was assumed that each post-secondary qualification lasted a specific length of time [37]. As in other studies, both PE and PE² were included in the model [9, 32, 37]. PE² captured the non-linear effect of labour market experience on income. Following the usual practice (for example, see [9, 32]), the analysis included the potential experience and added a children status variable to capture the effect of child-rearing on females’ labour force experience.[d]
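The experience proxy can be sketched as below. The function name, the `school_start_age` parameter name, and the example respondent are hypothetical; the mapping from qualifications to years of education used in the study is not reproduced here.

```python
def potential_experience(age, years_of_education, school_start_age=6):
    """Mincer proxy: age minus years of education minus 6."""
    return age - years_of_education - school_start_age

# Invented respondent: aged 40 with 15 years of full-time equivalent education.
pe = potential_experience(40, 15)
pe_squared = pe ** 2   # enters the model alongside PE as the squared term
print(pe, pe_squared)
```

Because the proxy assumes continuous participation after leaving education, it overstates experience for anyone with career breaks, which is the reason a children status variable is added alongside it.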