Homework self-regulation strategies: a gender and educational-level invariance analysis

This study investigates the measurement invariance of the Homework Behavior Questionnaire (Ktpc), an instrument developed to assess students' homework self-regulation strategies, as a function of gender and educational level. A sample of 1400 elementary and middle school students was used. Results of confirmatory factor analysis indicated a good fit of the theoretical model composed of three dimensions: planning, execution and evaluation of homework completion. The results also provided evidence for the existence of metric invariance and partial scalar measurement invariance across boys and girls and across elementary school and middle school students. The reliability of the scores in the three dimensions was high. Girls obtained higher scores than boys in planning, execution and evaluation. Middle school students had lower scores in planning than elementary school students. These findings are discussed, and their implications for practice are highlighted.


Background
Homework comprises the set of tasks that teachers assign to students to be completed outside school hours. Tutoring, preparation for tests and examinations, supervised study in school, correspondence study courses at home and extra-curricular activities such as sport, as well as study activities self-initiated by students, cannot be considered homework (Cooper, 2001). The positive effects of homework completion on students' academic achievement across several subject matters have been demonstrated in a large number of studies (see, for example, the meta-analyses by Cooper, Robinson, & Patall, 2006, and Fan, Xu, Cai, He, & Fan, 2017).
Homework not only contributes to academic performance, at a general or specific level (i.e. math/science), but has also been associated with students' self-regulation abilities. The relationship between the two variables can be understood in the light of the demands that students must deal with when doing their homework. The accomplishment of homework includes a sequential set of three phases. The first comprises the assignment of the work by the teacher and takes place in the classroom, the second takes place outside the classroom and consists of the execution of the prescribed tasks, and the last phase occurs upon return of the student to the classroom, after the work is done (Cooper, 1989, 2001; Coulter, 1979; Rademacher, 2000). In the second phase, students are responsible for doing the work prescribed by their teachers and managing time, spaces and environments, and even for seeking help whenever needed (Cooper, 2001; Corno, 2000; Trautwein & Koller, 2003). They should also accomplish the tasks in time, control possible internal and external distractors, decide which aids they will use and check whether all the prescribed tasks are complete (Epstein & Van Voorhis, 2001). Therefore, successful homework completion demands self-regulation abilities, and some instruments that assess the self-regulatory components that are present during the execution of homework have been developed.
One of these instruments is the Homework Management Scale (Xu, 2008; Yang & Xu, 2015), which is a self-report measure for high school students composed of 22 items distributed across five subscales: arranging environment, managing time, handling distraction, monitoring motivation and controlling emotion. Studies with American (Xu, 2008) and Chinese (Yang & Xu, 2015) samples of 11th graders demonstrated adequate reliability. In both samples, confirmatory factor analysis results provided empirical evidence for the five-factor structure. Although invariance of this structure was demonstrated across calibration and validation samples (Xu, 2008), no invariance studies were conducted considering variables that play a role in self-regulation, such as gender. Studies focusing on gender differences in self-regulated learning found that girls are more self-regulated than boys (Xu & Corno, 2006), spend more time doing homework (Rosário, Mourão, Núñez, González-Pienda, & Valle, 2006), regulate their behaviour better (Weis, Heikamp, & Trommsdorff, 2013) and take more initiative to manage their homework (Xu & Wu, 2013).
Another instrument that includes an assessment of the homework self-regulation components is the Self-Assessment Questionnaire: Homework (SAQ; Hong, Peng, & Rowell, 2009). The SAQ is composed of 34 items that measure homework utility value and intrinsic value (dimensions related to the task value), effort and persistence (dimensions related to the motivational outcome) and the planning and self-checking applied during the homework process (dimensions related to the metacognitive strategy use). Hong, Peng, and Rowell (2009) administered the SAQ to groups of 7th and 11th graders and found grade-level differences in the scores of the six SAQ dimensions, with the second group obtaining significantly lower scores than the first. They concluded, as a consequence, that "older Chinese students perceived homework as less useful, enjoyed doing homework less, expended less effort, persisted less, and engaged in planning and self-checking less than did younger students" (Hong et al., 2009, p. 274). In the same study, no gender differences were found. Nonetheless, no evidence for the grade and gender measurement invariance of the SAQ was provided, which is essential to guarantee the validity of these findings.
The Homework Distraction Scale (HDS; Xu, 2015) assesses one specific aspect of self-regulation in homework completion: the (in)ability to suppress distractors and maintain attention on the homework task. The HDS is composed of six items that the students must rate using a Likert-type scale and that are organized into two dimensions: (a) conventional distraction (e.g. Start conversations unrelated to what I'm doing) and (b) tech-related distraction (e.g. Stop math homework to play online games or video games). Xu, Fan, and Du (2015) tested the two-factor structure and its measurement invariance across gender using a sample of 796 Chinese 8th graders. They found evidence for the existence of metric invariance between boys and girls. However, given that scalar invariance was not tested, no mean comparisons between the two gender groups were performed.
In summary, these instruments measure distinct self-regulation abilities during homework completion and have been developed to assess middle and high school students, probably mirroring the fact that most of the research on homework completion and its related variables is conducted at these stages (e.g. Iflazoglu & Hong, 2012; Lau, Kitsantas, & Miller, 2015; Lee, Lee, & Bong, 2014; Núñez et al., 2015; Regueiro, Suárez, Valle, Núñez, & Rosário, 2015; Valle et al., 2016; Xu, 2011; Xu & Wu, 2013; Yang & Xu, 2015). The development of scales that evaluate self-regulation of homework behaviours in younger children remains an important research issue. Moreover, all of the reviewed instruments are self-report measures and, as Xu (2008) indicates, "there is a need to incorporate other measures of homework behaviors over time (e.g., student's homework behaviors as recorded and perceived by their teachers and their parents) to complement students' self-reports" (p. 320) in order to have a more complete and reliable understanding of this issue.
The Homework Behavior Questionnaire (Ktpc) was developed to assess self-regulation abilities in homework completion but focuses on homework as a sequential process that involves self-regulatory skills. Homework models (Cooper, 2001; Corno, 2000; Coulter, 1979; Rademacher, 2000), as well as Zimmerman's (2000) cyclical model of self-regulated learning, were used to guide the development of the Ktpc. The questionnaire is centred on the second phase of homework, which corresponds to assignment execution and usually occurs at home or in community contexts (Cooper, 2001; Coulter, 1979; Rademacher, 2000). Therefore, the Ktpc was developed in order to measure the processes, beliefs and behaviours that tend to occur during three steps: homework planning, execution and evaluation. Moreover, the questionnaire was developed to assess the behaviours of students from two educational levels, elementary school (grades 1-4) and the first cycle of middle school (grades 5-6; see endnote 1), based on the information provided by the students' parents or other tutors. Given that the factor structure of the Ktpc was not previously tested using confirmatory factor analysis, the first two goals of this study were to test the fit of the theoretical model to the data and to investigate the reliability of the scores obtained in the Ktpc.
As noted above, research has found consistent differences between girls and boys in self-regulation abilities. Therefore, it is crucial to check the measurement invariance of any instrument that focuses on these abilities so that meaningful comparisons can be performed between male and female students. In the studies of the previously mentioned instruments that measure homework self-regulation components (Hong et al., 2009; Xu, 2008; Xu, 2015; Yang & Xu, 2015), this was not examined. Similarly, given that research found that older students are less self-regulated during homework completion (Hong et al., 2009) and that the Ktpc was developed to assess children who attend two different educational levels, measurement invariance across these levels must also be guaranteed. Therefore, the third goal of this study was to investigate the measurement invariance of the Ktpc as a function of gender and educational level. Meaningful group comparisons using the Ktpc can be performed only after measurement invariance has been established.
Hence, the research questions of this study were as follows: (a) Does a multidimensional structure composed of three factors (planning, execution and evaluation) fit the data obtained in the Ktpc? (b) Are the scores of the Ktpc reliable? and (c) Is the factor structure invariant between boys and girls and between students from elementary and middle schools?

Participants and procedure
The sample was recruited by convenience, using a snowball sampling technique for data collection. Formal authorizations from the boards of the schools were collected prior to the questionnaire administration. The boards of seven public schools, located in the district of Porto (Portugal), were contacted and agreed to participate. These schools had a total of 1014 students from grades 1-4 and 611 students from grades 5-6. An informed consent form was distributed to the parents of these students informing them of the objectives of the study and asking for their collaboration, along with the questionnaires. This procedure was performed with the collaboration of the Psychology Services of the schools in which the study was conducted. The questionnaires were answered at home by the parents. The questionnaires were anonymous and were delivered and returned using closed envelopes. The response rate was 86.15%. Therefore, information regarding the homework behaviour of 1400 students from Portuguese elementary (grades 1 to 4) and middle (grades 5 to 6) schools was collected. Table 1 displays the number of students in each grade and by gender. The students were equally distributed by the six grades, and the number of boys and girls was equivalent in all grade levels, χ²(5) = 9.046, p = .107. However, given that the number of grade levels in elementary school was higher, the number of students from the middle school grades was substantially lower (n = 557) than the number of students from the elementary school grades (n = 841).
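The reported response rate can be checked directly from the enrolment figures given above. The following is a minimal sketch in Python; the variable names are illustrative and are not taken from the study.

```python
# Enrolment figures reported for the seven participating schools.
invited_elementary = 1014   # students in grades 1-4
invited_middle = 611        # students in grades 5-6
invited = invited_elementary + invited_middle

# Number of students for whom a questionnaire was returned.
returned = 1400

# Response rate as a percentage of all students whose parents were contacted.
response_rate = 100 * returned / invited
print(f"{returned} of {invited} questionnaires returned ({response_rate:.2f}%)")
```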

Measures
The Ktpc is composed of three subscales: planning (six items), execution (seven items) and evaluation (eight items). The first subscale includes goal-setting/planning and is related to self-management or structuring either one's self-processes (i.e. behaviours, thoughts, emotions) or the social environment, before engaging in homework assignments. The second consists of behaviours that tend to emerge during homework execution, namely those related to self-reinforcement, persistence and seeking support. The third includes behaviours that tend to emerge after the assignments are finished, such as self-evaluation, self-correction, homework revision and seeking homework feedback.
Each item consists of a statement, and the parents or tutors must rate the frequency of specific children's homework behaviours, using a 5-point Likert scale response format (0 = never, 1 = rarely, 2 = sometimes, 3 = often, 4 = always). The items are presented in the Appendix.

Statistical analyses
Analyses were conducted with Mplus, version 7 (Muthén & Muthén, 2012). Because the distribution of the variables was non-normal, the maximum likelihood estimation with robust standard errors (MLR) was used. Confirmatory factor analysis (CFA) was used to test the fit of the three-factor model in each of the four groups: male students, female students, elementary school students and middle school students. To assess the global fit of the tested models, the following criteria were used: the chi-square (χ²) values, the ratio between the chi-square and the degrees of freedom (χ²/df), the comparative fit index (CFI), the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR). Model fit was considered acceptable when χ²/df was lower than 3.00, CFI values were higher than .90, RMSEA lower than .08 and SRMR lower than .10 (Schermelleh-Engel, Moosbrugger, & Müller, 2003). Composite reliability was calculated for each factor and values higher than .70 were considered adequate (George & Mallery, 2002; Hair, Black, Babin, & Anderson, 2009). After fitting the model separately, in a second step, multigroup CFA was performed to test the invariance of the structure across genders and educational levels, following the guidelines indicated by van de Schoot, Lugtig, and Hox (2012) and Byrne (2012). First, a configural model (model 0), where loadings and intercepts were freely estimated, was tested. In model 1, metric invariance was tested, where the factor loadings were constrained but the intercepts were freely estimated. In model 2, scalar invariance was tested, where both loadings and intercepts were constrained to be equal across both samples. Evidence for the invariance of the model across both samples is achieved when the constraint of parameters performed in testing the subsequent models does not worsen the fit indices.
To perform this comparison, the Satorra-Bentler scaled chi-square difference test was calculated (ΔSB-χ²). The Bayesian information criterion (BIC) was also used as a comparison index: the model with the lowest value was considered to be the one that best represents the data. Moreover, two additional criteria were considered, as recommended by Cheung and Rensvold (2002) and Chen (2007): (a) the difference in CFI (ΔCFI), which should be equal to or lower than .01, and (b) the difference in RMSEA (ΔRMSEA), which should be equal to or lower than .015. When full scalar invariance was not achieved, partial invariance was established by freely estimating the parameters identified after examining the Lagrange multiplier tests. After establishing the invariance of the factor structure, differences in the latent means between the gender groups and the educational-level groups were calculated.
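The decision rules described in this section can be made concrete with a short sketch. The Python code below is illustrative only and does not reproduce the Mplus procedure used in the study: the class and function names and all numeric inputs are hypothetical. The scaled difference follows the commonly used Satorra-Bentler formula for chi-square values obtained under MLR, in which the scaling correction factor of each model is read from the estimation output.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Fit statistics for one model (all values used below are hypothetical)."""
    chi2: float     # scaled chi-square reported under MLR
    df: int         # degrees of freedom
    scaling: float  # MLR scaling correction factor
    cfi: float
    rmsea: float
    srmr: float


def acceptable_fit(m: Fit) -> dict:
    """Check each cut-off named in the text; one flag per index."""
    return {
        "chi2/df < 3.00": m.chi2 / m.df < 3.00,
        "CFI > .90": m.cfi > 0.90,
        "RMSEA < .08": m.rmsea < 0.08,
        "SRMR < .10": m.srmr < 0.10,
    }


def scaled_chi2_difference(nested: Fit, comparison: Fit) -> tuple:
    """Satorra-Bentler scaled difference between the more constrained
    (nested) model and the less constrained (comparison) model.
    Returns the test statistic and its degrees of freedom."""
    d_df = nested.df - comparison.df
    # Difference-test scaling correction.
    cd = (nested.df * nested.scaling - comparison.df * comparison.scaling) / d_df
    trd = (nested.chi2 * nested.scaling - comparison.chi2 * comparison.scaling) / cd
    return trd, d_df


def invariance_supported(nested: Fit, comparison: Fit) -> bool:
    """Apply the delta-CFI (<= .01) and delta-RMSEA (<= .015) criteria.
    Differences are rounded to three decimals, as fit indices are reported."""
    d_cfi = round(comparison.cfi - nested.cfi, 3)
    d_rmsea = round(nested.rmsea - comparison.rmsea, 3)
    return d_cfi <= 0.01 and d_rmsea <= 0.015


# Hypothetical configural (model 0) and metric (model 1) results.
configural = Fit(chi2=1000.0, df=372, scaling=1.20, cfi=0.930, rmsea=0.050, srmr=0.045)
metric = Fit(chi2=1020.0, df=390, scaling=1.21, cfi=0.929, rmsea=0.049, srmr=0.050)

print(acceptable_fit(configural))
trd, d_df = scaled_chi2_difference(metric, configural)
print(f"scaled chi-square difference = {trd:.2f} on {d_df} df")
print("metric invariance supported:", invariance_supported(metric, configural))
```

The p value for the scaled difference would be obtained by referring the statistic to a chi-square distribution with the returned degrees of freedom, a step left here to a statistics library.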

Results
Table 2 presents the model fit for each gender and educational-level group. The three-factor model had an acceptable fit in all gender and educational-level groups, as indicated by the CFI, RMSEA and SRMR values, although the χ²/df slightly exceeded the reference values. Figures 1 and 2 show the factor loadings for the items in each gender group, and Figs. 3 and 4 display the factor loadings in each educational level. As can be seen in Figs. 1, 2, 3 and 4, all factor loadings were higher than .30. Table 3 presents descriptive statistics and the reliability testing results as a function of gender and educational levels. Composite reliability values were higher than .80 for all subscales in all four groups and were particularly high for the evaluation subscale. Table 4 shows the results of the invariance testing across gender and educational levels. Regarding the gender invariance testing, the results for the configural model (model 0) indicated an acceptable fit, although the χ²/df exceeded the cut-off value of 3.00. The metric invariance model (model 1) fitted equally well, given that the ΔSB-χ² was nonsignificant, and the CFI and RMSEA differences did not exceed the reference values. The chi-square test of differences indicated that the scalar invariance model testing for boys and girls (model 2) fitted worse than model 1. Although the ΔRMSEA was lower than .015, the ΔCFI exceeded .01. The modification indices flagged the intercept of item 14, suggesting that it was not invariant. As a result, another model (model 3) was run to test partial scalar invariance. In this model, the identified intercept was unconstrained. Although the ΔSB-χ² was significant, the ΔCFI and ΔRMSEA were lower than the reference values (see Table 4). Moreover, model 3 had the lowest BIC of all models.
Taken together, these results indicate that model 3 did not fit worse than model 1, and provide evidence for the existence of partial strong factorial invariance across boys and girls. The comparison of the latent means indicated that girls obtained higher scores than boys in all three dimensions: planning (ΔM = .45, p < .001), execution (ΔM = .18, p < .01) and

Educational-level invariance
Regarding the educational-level invariance testing, the results for the configural model (model 0) indicated an acceptable fit, with all indices (excepting χ²/df) being within the reference values. The chi-square test of differences indicated that the metric invariance model (model 1) fitted worse than the configural model. However, the ΔCFI and ΔRMSEA tests did not exceed the reference values and model 1 had a lower BIC than model 0, supporting the invariance of the factor loadings across students from both educational levels. The results for model 2 that tested scalar invariance indicated a poorer fit compared to the previous model: the chi-square test of differences was significant, the BIC value was higher than the one obtained for model 1 and the ΔCFI was higher than .01. The inspection of the modification indices led us to identify four intercepts (intercepts of items 12, 14, 16 and 18) that could improve the fit of the model if released, suggesting that these were not invariant. Consequently, a fourth model was run to test partial scalar invariance, where loadings and intercepts were constrained, excepting the four intercepts that were identified as non-invariant. Releasing these intercepts led to an improvement in the model fit (see Table 4). Therefore, partial measurement invariance across the elementary school and the middle school sample was established, and differences in the latent means between both groups were subsequently computed. When compared with the elementary school students, middle school students had lower results in planning (ΔM = −.17, p < .01). However, no differences between elementary and middle school students were found in execution (ΔM = .01, p = .92) and evaluation (ΔM = .04, p = .53). If the four items that had non-invariant intercepts were excluded (all from the execution dimension) in the comparison of latent means, the results were similar, as no differences between elementary and middle school students were found in homework execution strategies (ΔM = −.01, p = .93).
Fig. 1 Factor loadings for the three-factor model in the girls' group
Fig. 2 Factor loadings for the three-factor model in the boys' group

Discussion
The Ktpc is an instrument constructed to assess students' homework behaviour, as reported by parents or other tutors. Homework behaviour is related to what students do when dealing with homework, how they approach their work and how they manage their personal resources and homework settings. The first goal of this study was to test the fit of a model composed of three factors (planning, execution and evaluation) to the data obtained with the Ktpc in four groups: boys, girls, students from elementary school and students from middle school.
The results from confirmatory factor analysis offer support to the validity of the instrument, with an acceptable fit of the proposed theoretical model to the data in all four groups. The three dimensions of the model are theoretically defined as essential self-regulated processes in homework execution: homework planning, homework execution and homework evaluation. In essence, the Ktpc provides information about how frequently an individual uses self-regulated strategies to complete assignments. Thus, students who score high on this scale will often regulate their behaviours, thoughts and emotions to finish homework and to associate the execution of the assignments to school outcomes. The second goal was to explore the reliability of the results obtained in the Ktpc. Composite reliability results indicate that the scores in the Ktpc are highly reliable.
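The article does not state the formula used to obtain composite reliability. A commonly used version, assumed here for illustration, computes it from the standardized factor loadings of a subscale under the assumption of uncorrelated residuals. The sketch below uses hypothetical loadings for a six-item subscale, not the values estimated in this study.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings, assuming
    uncorrelated residuals:
        (sum of loadings)^2 / ((sum of loadings)^2 + sum of (1 - loading^2))
    """
    true_part = sum(loadings) ** 2
    error_part = sum(1.0 - lam * lam for lam in loadings)
    return true_part / (true_part + error_part)


# Hypothetical standardized loadings for a six-item subscale.
example_loadings = [0.70, 0.65, 0.72, 0.68, 0.75, 0.60]

cr = composite_reliability(example_loadings)
# The .70 threshold is the adequacy criterion adopted in the article.
print(f"composite reliability = {cr:.3f}; adequate: {cr > 0.70}")
```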
The third goal of this study was to investigate the measurement invariance of the three-dimensional structure between boys and girls and between students from elementary and middle schools. The results of this study indicated that the three-dimensional structure is partially invariant between boys and girls and between students from elementary and middle schools, thus allowing meaningful comparisons across groups. Gender differences were found in all three self-regulation dimensions of homework, and these differences were favourable to girls. These findings are consistent with the bulk of research, which indicates that not only are girls usually more self-regulated but they also invest more time and effort and apply better self-regulation strategies in completing homework (Rosário et al., 2006; Weis et al., 2013; Xu & Corno, 2006; Xu & Wu, 2013). However, these findings run contrary to those of Hong et al. (2009), who found no gender differences in homework planning and self-checking as measured by the SAQ. The differences in the cultural settings (Portugal versus China) and the differences in the educational levels assessed may explain the divergence between the results of the present study and the ones obtained in the study by Hong et al. (2009).
Differences were also found between elementary and middle school students, but only in homework planning. Elementary school students had higher results in planning when compared with middle school students. This result may reflect the progressive decrease in involvement in study activities and in positive attitudes towards homework as students advance in the educational system (Rosário et al., 2005). Hong et al. (2009) had also found that students from the 11th grade planned homework less than the 7th grade students. Taken together, these and our results may indicate that homework planning decreases as the school grade increases and that this decrease starts at early stages, but this hypothesis must be investigated using a longitudinal design in the future.
Although the differences in the groups' latent means were similar, whether the non-invariant items were considered or excluded, future gender and educational-level comparisons of the scores obtained in the Ktpc should be interpreted cautiously, given that only partial measurement invariance was obtained.
All the items with non-invariant intercepts belonged to the execution subscale and had in common the fact that they are all related to the self-control and ability needed to manage homework execution without the help of others, such as an adult (see the Appendix). Although research has shown that students tend to become gradually more autonomous in homework completion as they grow older, it has also indicated that students with higher academic achievement seek adult supervision more often than students with lower academic achievement (for a review, see Corno & Mandinach, 2004). Additionally, the benefits of family help during homework completion are unclear. The results of some studies (e.g. Xu, 2007; Xu, Du, & Fan, 2016) indicated that family help was positively associated with the development of homework management abilities, but other studies (e.g. Silinskas, Kiuru, Aunola, Lerkkanen, & Nurmi, 2015) showed that excessive help, especially when children are perceived by their mothers as not very autonomous, led to poorer academic performance. Consequently, future studies should explore the pertinence of maintaining these items in the Ktpc.

Conclusions
Although the literature on self-regulation and homework is growing, fewer studies focus specifically on the metric properties of questionnaires of homework self-regulation behaviours, and as far as we know, no instruments have been developed to assess these behaviours in children from the initial school grades. The Ktpc is a step taken towards that goal and presents itself as a promising and reliable measuring instrument focused on the process of homework completion. This scale may also help teachers to develop interventions to foster self-regulated learning through homework improvement. Future research should focus on gathering other types of validity evidence for the Ktpc, such as evidence based on the relationship to other variables. It would be particularly important to study the relationship of the scores in homework planning, execution and self-evaluation with academic achievement across different school subjects and specific subject matters.
Endnotes
1 In Portugal, elementary school, also known as the first cycle of basic education, comprises grades 1 to 4 (children aged between 6 and 10 years old). Middle school comprises the second cycle of basic education (grades 5 to 6; children aged between 10 and 12 years old) and the third cycle of basic education (grades 7 to 9; children aged between 12 and 15 years old).

Appendix
Planning
When he/she prepares himself/herself to perform homework, he/she tries to find a good space to work.
Note: Items with an asterisk (*) should be reverse coded