
Pieces of evidence of reliability of the Brazilian version of the Child Executive Functions Battery (CEF-B)


Executive dysfunctions are central symptoms in different neurological, developmental, and context-related conditions. The assessment of these functions is therefore essential in pediatric neuropsychological clinical practice. Given the need for reliable and valid evaluation batteries for clinical practice in Brazil, this study aimed to present evidence of reliability of the Child Executive Functions Battery (CEF-B). A total of 230 Brazilian children with typical development aged between 7 and 12 years participated in the study. Internal consistency was determined by the split-half method, Cronbach’s α, and Ω. In addition, test-retest reliability and intraclass coefficients were calculated. Retest indicators were mostly weak and moderate (between .43 and .75). Overall, coefficients show satisfactory internal consistency reliability for planning and inhibition measures (between .72 and .92). Considering the measures of WM, results were also satisfactory for both α and Ω indexes. This study revealed that the CEF-B has satisfactory internal consistency reliability coefficients. However, several tests have shown low reliability when assessed through the test-retest method. In general, findings reveal interesting pieces of initial evidence of reliability of the Brazilian version. The methodological approach could be improved in future studies by including children with executive disorders.


The scientific advancement of child neuropsychology has allowed the identification of numerous contexts of brain vulnerability that represent risks of cognitive and behavioral disorders during childhood. Acquired brain injuries and congenital, neurodevelopmental, and neuropsychiatric disorders constitute clinical conditions with potential risk for early neuropsychological dysfunctions. In addition, social and cultural conditions such as maternal nutrition, maternal abuse of alcohol and other drugs during pregnancy, inadequate living conditions, physical violence, and sexual abuse, among others, represent potential risk conditions for developmental dysfunctions. These clinical settings greatly affect executive functions (EF), and executive impairments are among their central symptoms (Craig et al. 2016; Evinç et al. 2018; Lonergan et al. 2019; Mauger et al. 2018; Zelazo 2020). In fact, the early and prolonged physiological maturation of the prefrontal circuits involved in the development of EF imposes a substantial vulnerability on these high-level skills (Dennis 2006). Since these functions are essential to behavioral control and regulation skills, their efficient functioning provides a fundamental basis for psychological development, including cognition, emotions, and social interactions (Diamond 2013). Thus, the identification of early changes in EF in the pediatric population constitutes a substantial clinical and scientific issue.

Assessment of EF in children

Performance-based tests are still the most usual method for the assessment of EF in children. They provide a standardized and structured evaluation framework that is relatively objective and easy to operate. Over the last 20 years, numerous performance-based tests have been developed or adapted around the world. However, the influence of historical, social, and cultural factors on the emergence of EF in children demands particular considerations when using tests. In Brazil, it is particularly important to consider these aspects because it is a country with a remarkable cultural variability and socioeconomic inequality (Piccolo et al. 2016).

A recent systematic review identified 37 executive measures used in Brazil in the pediatric context (Guerra et al. 2020a). Despite the great variety of tests found, only 13 are allowed to be used in clinical practice by the Federal Council of Psychology. In addition, only eight correspond to measures specially designed for the assessment of EF in children. The remaining five correspond to tests created for the assessment of EF in adults that had their sample expanded for evaluation in children, without considering the dynamic aspects of executive development. Also, studies that considered at least the three basic executive components are scarce and, to date, no specific battery for assessing EF in Brazilian children is available (Guerra et al. 2020a).

Measurement errors in executive tasks represent another concern that is still often neglected in child evaluation (Van der Linden et al. 2000). The inevitable involvement of more basic skills in executive tasks necessarily makes them impure and requires (i) the use of dissociation methods to distinguish basic skills from executive ones and (ii) confirmation of the executive nature of the difficulties encountered by comparing various tasks that require different non-executive processes (Denckla 1996). This inherent “noise” in executive measures is amplified in child assessment since the non-executive processes are potentially still developing and contribute to age-related variations (Roy et al. 2017). These issues highlight the importance of using appropriate and reliable executive measures for children to meet the objectives of neuropsychological assessment and reduce the potential risk of false positives and false negatives in clinical practice (Guerra et al. 2020a).

Child Executive Functions Battery

To address the aforementioned assessment challenges and the scarcity of instruments adapted for the pediatric population, the Child Executive Functions Battery (CEF-B) was created in France (Roy et al. 2020). The CEF-B consists of a set of 12 performance-based tests aimed at children and adolescents between 6 and 16 years old. The battery is based on a child-centered theoretical model and assesses the main executive processes: inhibition, flexibility, working memory (WM), and planning (Diamond 2013). It comprises new experimental tasks and tests that already exist in the international literature but have been modified or expanded to better suit the pediatric population. Each component (inhibition, WM, flexibility, and planning) is represented by three tests, which are assumed to preferentially capture the corresponding dimension. However, this task affiliation is not exclusive because of the interdependent character of EF. One verbal test is proposed per component, while the others are predominantly nonverbal (and mixed, in the case of WM). This approach was conceived in order to cross-reference indicators and provide clinicians with appropriate tests in the case of communication, visuospatial, or gestural disorder (Roy 2015).

The design of a battery specially conceived for children stimulated interest in developing a larger cross-cultural project. Thus, a dynamic intercultural approach has been consolidated with several countries, including Brazil. Given the lack of EF test batteries based on child-centered theoretical models, the CEF-B was adapted to the Brazilian context (Guerra et al. 2020b). However, a test’s adaptation is only the first step toward its implementation in a new culture (Borsa et al. 2012). A crucial point regarding the scientific approach of neuropsychological measures is associated with psychometric validity and reliability. These characteristics refer to the legitimacy of the interpretations provided by the test result and the empirical evidence regarding the correspondence between theoretical expectations and the measurement itself (Muniz 2004).

Regarding the CEF-B, preliminary evidence of validity of the French version has been published for studies with children with typical development (Roy et al. 2018) and with different clinical conditions, such as neurofibromatosis type 1, or NF1 (Remigereau et al. 2018; Roy et al. 2010, 2014), frontal epilepsy (Charbonnier et al. 2011), and brain tumors (Roche et al. 2018). These initial data indicate good sensitivity of the battery for the evaluation of EF in pediatric populations. Evidence of developmental validity (age-related performance improvement) was found for inhibition tests (Stroop test; F(5, 108) = 10.42, p < .001). In addition, good clinical sensitivity was observed through statistically significant differences between clinical and control groups for planning tasks (Rey-Osterrieth Complex Figure; F(1, 69) = 6.889, p = .011 for the NF1 group, and Z score = 2.89 for the frontal epilepsy case) and for flexibility tests (Kids Card Sorting Test; p < .001 for the NF1 group).

Concerning the Brazilian version, a significant improvement with age (developmental validity) was observed between 7- and 12-year-old children (Guerra et al. 2021). A 4-factor EF structure was also found through exploratory factor and correlation analyses, which corroborates the theoretical assumption underlying the CEF-B. The same study showed that the CEF-B is sensitive to the negative impact of low socioeconomic status on executive development, which agrees with the current literature (Farah 2017; Merz et al. 2019). This study on the trajectory and structure of EF in the pediatric population of northeast Brazil presents initial evidence of validity that endorses the theoretical and methodological premises of the CEF-B (Guerra et al. 2021). In addition, convergent and divergent validity analyses were also carried out. The findings indicate adequate external validity in relation to the three established executive tasks used, and good divergent validity compared to the non-executive measures (Guerra 2020). However, given the relevance of providing numerous indicators that attest to the importance and utility of a test, we propose to evaluate complementary and different pieces of evidence of reliability of the CEF-B in Brazil.



A total of 230 Brazilian children with typical development aged between 7 and 12 years participated in the study. The sample was homogeneously distributed by age, gender, and type of school (Table 1). Participants were selected based on the following inclusion criteria: (a) signing of the informed consent form by parents or legal guardians; (b) regular registration in public or private school; (c) absence of a history of developmental, neurological, or psychiatric disorders; (d) absence of uncorrected sensory alterations; and (e) scaled score equal to or higher than seven points in the WISC-IV Matrix Reasoning and Vocabulary subtests.

Table 1 Sociodemographic data


Table 2 presents a brief description of the 12 tests that compose the CEF-B (for a more detailed description of the tasks and variables used, see Guerra et al. 2021). The order of application of the tests that integrate the protocol was defined in a systematic and pseudo-random manner, alternating the investigated executive skills and their verbal/non-verbal nature. In order to limit measurement errors, the variables of the CEF-B were designed to modulate the executive load involved in some multi-composite tests. This approach consists of providing “control” conditions that are supposed to be less demanding on executive processes (e.g., subtracting the Trail A score from the Trail B score to “isolate” the contribution of executive abilities in the Trail Making Test; Arbuthnott and Frank 2000).
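The subtraction logic mentioned above can be sketched as follows. This is a minimal illustration, assuming completion times in seconds; the function name `tmt_difference_score` and the example values are hypothetical and are not taken from the CEF-B materials.

```python
def tmt_difference_score(time_a: float, time_b: float) -> float:
    """Return Trail B time minus Trail A time (hypothetical helper).

    Part A serves as the "control" condition, so the difference is
    read as the extra cost attributed to the executive (switching) load.
    """
    if time_a < 0 or time_b < 0:
        raise ValueError("completion times must be non-negative")
    return time_b - time_a


# Illustrative values only, not data from the study.
print(tmt_difference_score(time_a=35.0, time_b=80.0))
```

A difference score of this kind removes the part of the completion time that both conditions share (for example, visual search and motor speed), at the cost of combining the measurement error of the two parts.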

Table 2 Description of CEF-B tests and scientific rationale for controlling measurement errors and methodological bias


The study was conducted in 14 public and private schools in Natal, Parnamirim, and Elói de Souza in the Rio Grande do Norte state. The project was submitted to and approved by the Research Ethics Committee of the Federal University of Rio Grande do Norte, under code 48383715.1.0000.5537. After the informed consent form was signed by legal guardians, children were evaluated using the Vocabulary and Matrix Reasoning subtests in a single session lasting approximately 20 min at the school itself and during the regular school term.

The selection of participants was carried out in collaboration with the coordinators and teachers of each institution, who indicated the children to participate in the study. Coordinators and teachers were asked to indicate children with no suspicion or diagnosis of neurodevelopmental disorders. We randomly selected the participants from the list of children who were indicated by school professionals and also authorized by their parents to participate in the study. It should be noted that a questionnaire was completed by the parents when signing the consent form to ensure that the children who participated in the research did not present a history of developmental, neurological, or psychiatric disorders. A total of 264 signed consent forms were collected, and 244 children and adolescents were administered the WISC-IV Vocabulary and Matrix Reasoning subtests. Fourteen of the participants obtained scaled scores below seven in one of the subtests and were therefore excluded from the sample because they did not meet one of the inclusion criteria.

All participants were individually evaluated in a quiet room in their school or home environment. The tests were administered by trained neuropsychologists using standardized instructions. The assessment of the children consisted of the application of the entire CEF-B, requiring two or three assessment sessions with a duration of approximately 30–40 min each, depending on the age of the child. The tests were systematically presented in the same order. The first session included the 8 Mazes, Stroop, Visuospatial updating, Scripts, and Tapping tasks. The second session comprised the Rey Complex Figure, Trail Making Test, Dual task, Kids Card Sorting Test, Cross-out Joe, Verbal updating test, and Frog test. In case an additional session was needed, four tests were presented per session in the aforementioned order.

The second phase comprised the application of the tests that were selected for the retest method. This step was carried out 4 to 6 weeks after the last assessment session of the child. One 40-min session was required to perform the 6 CEF-B tasks that were retested, which were administered by the same evaluator in the following order: Stroop, Tapping, Kids Card Sorting Test, Trail Making Test (TMT), Dual task, and Frog test.

Statistical analyses

The reliability of the CEF-B was verified by several methods. In fact, the study of the reliability of EF measurements is complex because analyses by internal consistency and split-half methods are in most cases not applicable. In addition, test-retests can affect the validity of the second measurement, since time measures can be associated with learning effects between the two sessions (Soveri et al. 2018). Finally, the examiner is sometimes an important source of measurement error, which means that reliability among examiners must also be calculated (Urbina 2007). For these reasons, different indicators were used for the tasks that compose the CEF-B.

Retest was applied for tasks in which the time factor was central to the accomplishment of the task and when the use of another method was not applicable. It was applied for all flexibility measures, two inhibition measures (Stroop and Tapping), and one WM test (Dual task). For tests in which it was possible to use reliability methods other than retest, we prioritized classical methods such as split-half (with the Spearman-Brown formula used to correct for the effect of halving the number of items), internal consistency, and intraclass coefficients. For the Scripts, the 8 Mazes, and the Verbal and Visuospatial updating tasks, two indicators of internal consistency were applied (Cronbach’s alpha and omega; Cronbach 1951; McDonald 1985, 1999). Also, the split-half method (even and odd items) was used for the 8 Mazes test. Pearson’s correlation was calculated for both parts of the Cross-out Joe test. Since part B of this test corresponds to the mirrored version of part A, the purpose of this measure was to demonstrate the equivalence of these two parts. To this end, we calculated the correlation between the parts both when A was applied first, followed by B (A-B), and when they were applied in the opposite order (B-A). Regarding the Rey Complex Figure, an intraclass coefficient was calculated for three indices: the copy score, the program score, and the planning index. For the calculation of this coefficient, four different examiners scored the figures. All statistical analyses were performed using the “psych” package (Revelle 2020) of the R software (R Core Team 2020). For all analyses, the significance level was set at .05.
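To make the classical indices named above concrete, the following is a minimal Python sketch of an odd-even split-half coefficient with the Spearman-Brown correction and of Cronbach's alpha. It is not the authors' analysis code (the study used the R "psych" package); the function names and the toy item scores are hypothetical, and omega is left out because it requires fitting a factor model.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either list is constant


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(scores):
    """scores: one row per child, one column per item (odd-even split)."""
    odd = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up item scores for five children on four items (not study data).
toy_scores = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 1]]
print(split_half_reliability(toy_scores), cronbach_alpha(toy_scores))
```

The Spearman-Brown step matters because the correlation between two halves describes a test of half the length; without the correction, the split-half estimate would understate the reliability of the full task.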


Test-retest reliability

Table 3 summarizes, by domain and task, the reliability indices obtained with the retest method. The coefficients observed are mostly low, but some are moderate. In fact, reliability coefficients vary according to the type of measurement within some tasks, for example Tapping (Tapping Go/No-Go Timer = .18, while the other task indicators vary between .43 and .44), Dual task (Evolution clowns r = .23, while Score Mu and Evolution digits vary between .56 and .57), and the Frog test (Time = .43 and Score = .70). However, for other tests such as the Kids Card Sorting Test (KCST; Time = .59, Categories = .60, Perseveration = .75) and Stroop (Time = .53; Error = .50), the coefficients seem to be more homogeneous. Significant practice effects were found for all variables of two (KCST and Frog test) of the six tasks assessed by the retest method. Overall, this result reveals a trend toward improved performance at the retest phase (Table 3).
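The two quantities discussed here, retest stability and practice effects, can be illustrated with a short sketch. The text reports correlation coefficients (r) for the retest but does not state how practice effects were tested, so the paired t statistic below is an assumption; the function name `retest_summary` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def retest_summary(first, second):
    """Return (Pearson r, mean gain, paired t statistic) for two sessions.

    first, second: scores of the same children at test and retest,
    listed in the same order.
    """
    n = len(first)
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    ss1 = sum((a - m1) ** 2 for a in first)
    ss2 = sum((b - m2) ** 2 for b in second)
    r = cov / sqrt(ss1 * ss2)

    gains = [b - a for a, b in zip(first, second)]
    # Paired t statistic (assumed test): mean gain over its standard error.
    t_stat = mean(gains) / (stdev(gains) / sqrt(n))
    return r, mean(gains), t_stat


# Made-up scores for five children (not study data).
r, gain, t_stat = retest_summary([10, 12, 14, 16, 18], [12, 13, 17, 17, 21])
print(r, gain, t_stat)
```

The two indices answer different questions: r reflects whether children keep their rank order across sessions, whereas the mean gain reflects whether the group as a whole improved, so a task can show a clear practice effect and still have a high or a low retest correlation.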

Table 3 Reliability coefficients obtained through the test-retest method

Split-half method, internal consistency, and intraclass coefficients

The results obtained through the split-half method, internal consistency, and intraclass coefficients are described in Table 4. Coefficients show overall satisfactory reliability for planning and inhibition measures (between .72 and .92). Considering the measures of WM, results were also satisfactory for both alpha and omega indexes.

Table 4 Reliability indicators using the split-half method, internal consistency, and intraclass coefficients for planning, WM, and inhibition tests


The aim of the present study was to present additional psychometric properties of the CEF-B in a sample of 7–12-year-old children from Northeastern Brazil. Results revealed initial, adequate evidence of reliability for the CEF-B for the assessment of EF in Brazilian children.

Different alternatives are reported in the literature to evaluate reliability (Gregory 2010). It is currently suggested to use several methods that provide evidence of the test’s reliability, similar to what is observed for validity. In fact, it should be noted that reliability is a characteristic of the test scores and not of the test itself. In this sense, these various methods and the choice of variables produce complementary but sometimes contradictory estimates. In executive tasks, the study of the reliability of measurements is complex because test-retests can affect the validity of the second measurement (Soveri et al. 2018). Also, analyses by internal consistency and split-half methods are in most cases not applicable. Finally, the examiner is sometimes an important source of measurement error, which means that reliability among examiners must also be calculated (Urbina 2007).

To examine the evidence of reliability in the Brazilian CEF-B version, we used different methods according to the nature of each executive test. For tasks where the time factor was central to the accomplishment of the task (half of the tasks), the retest was applied. In general, retest indicators were mostly weak and moderate. These results indicate a fluctuating retest stability depending on the measurement used. Such variability can be explained either by the different strategies used by the child in test and retest situations or by learning effects. The lowest values concern the Tapping task (r = .18), for which the results seem to be very dependent on testing conditions. In fact, this finding could be related to examiner bias, since the delay between the presentation of the stimulus and the children’s response can vary between examiners. This task would be more susceptible to this type of variation given the need to react according to the child’s behavior. The reaction time of the child may condition the time of presentation of the stimulus by the examiner. A solution for this issue could be the use of a pre-defined delay, which could be achieved by converting the test to a digital version.

Another task that also seems to have problematic evidence is the Dual task (Evolution clowns variable; r = .23). Results observed on the retest may reflect changes in strategies (prioritizing the motor task over the cognitive task and vice versa) that lead to indirect effects on reliability scores. The same reasoning is valid for the TMT index. It is important to note that this variation has also been observed in previous studies on the assessment of EF. In fact, when several EF tasks are administered, performance in these tasks is often poorly correlated and reliability rates are low (e.g., Lemay et al. 2004; Soveri et al. 2018; Willoughby et al. 2017). This weak reliability, most often associated with the test-retest situation, is usually explained by the fact that these EF tasks are susceptible to a practice effect that partially distorts this assessment.

For tests in which it was possible to use reliability methods other than test-retest, we prioritized classical methods such as split-half, internal consistency, and intraclass coefficients. Regarding the reliability assessed through Cronbach’s alpha and omega coefficients, satisfactory indicators were found for Scripts, 8 Mazes, and the updating tasks (between .82 and .87). According to the Brazilian Federal Council of Psychology (Conselho Federal de Psicologia – CFP 2003; Primi et al. 2004), the minimal acceptable value for these indices is .60. In addition, some authors have suggested the following value classification: .80–.90, very good; .70–.80, respectable; .65–.70, acceptable; .60–.65, undesirable; and below .60, unacceptable (Freire and Almeida 2001). It is also important to note that the split-half method indicators for the 8 Mazes test were also high, showing good internal consistency of the task. The coefficients for Cross-out Joe were also acceptable, both under the A-B application order (r = .77) and under the reverse order (B-A; r = .71). This result also indicates adequate internal consistency of the task.
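The classification bands quoted above can be written as a small lookup. This is a sketch with a hypothetical function name; the bands as quoted share their endpoints (for example, .80 closes one band and opens the next), so assigning a boundary value to the higher band is an assumption, and values above .90 are reported separately because the quoted scale does not label them.

```python
def classify_reliability(coef: float) -> str:
    """Map a coefficient to the bands quoted from Freire and Almeida (2001).

    Assumption: a value on a shared boundary goes to the higher band.
    """
    if coef < 0.60:
        return "unacceptable"
    if coef < 0.65:
        return "undesirable"
    if coef < 0.70:
        return "acceptable"
    if coef < 0.80:
        return "respectable"
    if coef <= 0.90:
        return "very good"
    return "above the quoted bands"


# Coefficients reported in the text for Cross-out Joe and the alpha/omega range.
for value in (0.71, 0.77, 0.82, 0.87):
    print(value, classify_reliability(value))
```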

Regarding the Rey-Osterrieth Complex Figure (ROCF), reliability issues are mostly associated with the scoring method. In fact, the scoring of the task is often considered subjective. In the CEF-B version of the test, we adopted more rigorous and objective instructions for the evaluation of the precision and location of the figure elements drawn by the children. Thus, the agreement between examiners was measured by the intraclass coefficient for the three measures of the test. Results showed good agreement (.86 to .92) and indicate that more rigorous and objective scoring of the test may lead to more stable scores between examiners.
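As an illustration of how agreement among several examiners can be quantified, the sketch below computes one common form of the intraclass coefficient, ICC(2,1) (two-way random effects, absolute agreement, single rater). The text does not state which ICC form was used, so this choice is an assumption, and the function name and ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per rated figure, one column per examiner.
    """
    n = len(ratings)      # number of figures (targets)
    k = len(ratings[0])   # number of examiners (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition of the total sum of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up copy scores for five figures rated by four examiners (not study data).
toy_ratings = [
    [30, 31, 29, 30],
    [22, 24, 23, 22],
    [35, 34, 36, 35],
    [18, 20, 19, 19],
    [27, 27, 28, 26],
]
print(icc_2_1(toy_ratings))
```

An absolute-agreement form penalizes examiners who are systematically more lenient or more severe than their colleagues, which is the relevant concern when a single examiner's score is used clinically; a consistency form would ignore such systematic offsets.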

It should be mentioned that, in addition to the findings regarding evidence of reliability, the study by Guerra et al. (2021) made it possible to expand the CEF-B adaptation process (Guerra et al. 2020b) by presenting preliminary normative data and evidence of validity that favor the use of this version in the Brazilian context. Results revealed evidence of developmental validity and a factorial structure compatible with the theoretical proposition of the battery, revealing consistent evidence of construct validity. In addition, data regarding the negative impact of low socioeconomic status (SES) on EF performance also corroborate the literature and show satisfactory sensitivity of the battery in identifying these potential differences (Guerra et al. 2021).

Although classical measures of validity and reliability are necessary to establish the scientific nature of the battery, only proof of clinical utility regarding the dissociation of a deficit and the expected result can truly attest to its relevance. Thus, one of the future objectives of the CEF-B project in Brazil is to provide clinical data in order to assess the clinical sensitivity of the battery. This endeavor is currently being carried out in several research centers in Rio Grande do Norte.

The main limitation of the present study concerns the sample size and the generalization of the findings to the Brazilian context. In fact, Brazil’s population and its social and economic diversity require a wider sample in order to ensure the representativeness of its cultural diversity (see Guerra et al. 2020a for a review). In particular, the sample used for the test-retest reliability analysis should be extended in future studies. In addition, the lack of data from children with clinical conditions, which would lead to a higher variance in the data, constitutes an important limitation to the generalizability of the results.


This study revealed that the tests of the CEF-B presented satisfactory evidence of reliability when assessed through the split-half method, internal consistency, and intraclass coefficients. On the other hand, several tests showed low reliability when assessed by the test-retest method. Although these findings reveal interesting pieces of initial evidence of reliability of the Brazilian version, the methodological approach could be refined in future studies in order to include children with executive disorders and to assess the clinical sensitivity of the CEF-B.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


  1. Arbuthnott, K., & Frank, J. (2000). Trail making test, part B as a measure of executive control: Validation using a set-switching paradigm. Journal of Clinical and Experimental Neuropsychology, 22(4), 518–528.

    Article  Google Scholar 

  2. Borsa, J. C., Damásio, B. F., & Bandeira, D. R. (2012). Adaptação e validação de instrumentos psicológicos entre culturas: algumas considerações. Paidéia (Ribeirão Preto), 22(53), 423–432

    Article  Google Scholar 

  3. Charbonnier, V., Roy, A., Seegmuller, C., Gautier, A., & Le Gall, D. (2011). Etude d’un cas de syndrome dysexécutif à prédominance cognitive chez un enfant présentant une épilepsie frontale symptomatique. Revue de Neuropsychologie, Neurosciences Cognitives et Cliniques, 3(1), 11–22.

    Article  Google Scholar 

  4. Conselho Federal de Psicologia – CFP (2003). Resolução N° 002/2003. Brasília: CFP.

    Google Scholar 

  5. Craig, F., Margari, F., Legrottaglie, A. R., Palumbi, R., De Giambattista, C., & Margari, L. (2016). A review of executive function deficits in autism spectrum disorder and attention-deficit/hyperactivity disorder. Neuropsychiatric Disease and Ttreatment, 12, 1191–1202

    Google Scholar 

  6. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334

    Article  Google Scholar 

  7. Denckla, M. B. (1996). Research on executive function in a neurodevelopmental context: application of clinical measures. Developmental Neuropsychology, 12(1), 5–15

    Article  Google Scholar 

  8. Dennis, M. (2006). Prefrontal cortex: typical and atypical development. In J. Risberg, & J. Grafman (Eds.), The frontal lobes: Development, function and pathology, (pp. 128–162). New York: Cambridge University Press.

    Google Scholar 

  9. Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135–168

    Article  Google Scholar 

  10. Evinç, S. G., Pektaş, E., Foto-Özdemir, D., Yıldız, Y., Karaboncuk, Y., Bilginer-Gürbüz, B., … Sivri, H. S. (2018). Cognitive and behavioral impairment in mild hyperphenylalaninemia. Turkish Journal of Pediatrics, 60(6), 617–624

    Article  Google Scholar 

  11. Farah, M. J. (2017). The neuroscience of socioeconomic status: correlates, causes, and consequences. Neuron, 96(1), 56–71

    Article  Google Scholar 

  12. Freire, T. & Almeida, L. S. (2001). Escalas de avaliação: construção e validação. In E. Fernandes & L. S. Almeida (Eds), Métodos e técnicas de avaliação: contributos para a prática e investigação psicológicas. Braga: CEEP, Universidade do Minho.

  13. Gregory, R. J. (2010). Psychological testing: history, principles, and applications, (2nd ed., ). New-York: Allyn & Bacon.

    Google Scholar 

  14. Guerra, A. (2020). The Brazilian version of the child executive functions battery (CEF-B): psychometric properties and executive development profile of children from the Northeast of Brazil. Natal: Universidade Federal do Rio Grande do Norte.

    Google Scholar 

  15. Guerra, A., Hazin, I., Siebra, C., Rezende, M., Silvestre, I., Le Gall, D., & Roy, A. (2020a). Assessing executive functions in Brazilian children: a critical review of available tools. Applied Neuropsychology: Child

  16. Guerra, A., Guerra, Y., Silvestre, I., Rezende, M., Le Gall, D., Roy, A., & Hazin, I. (2020b). Cross-cultural adaptation of the Childhood Executive Functions Battery (CEF-B) for brazilian Portuguese. Avaliação Psicológica, 19(4)

  17. Guerra, A., Hazin, I., Guerra, Y., Roulin, J., Le Gall, D., & Roy, A. (2021). Developmental profile of executive functioning in school-age children from Northeast Brazil. Frontiers in Psychology.

  18. Lemay, S., Bédard, M. A., Rouleau, I., & Tremblay, P. L. G. (2004). Practice effect and test-retest reliability of attentional and executive tests in middle-aged to elderly subjects. Clinical Neuropsychologist, 18(2), 284–302

    Article  Google Scholar 

  19. Lonergan, A., Doyle, C., Cassidy, C., MacSweeney Mahon, S., Roche, R. A., Boran, L., & Bramham, J. (2019). A meta-analysis of executive functioning in dyslexia with consideration of the impact of comorbid ADHD. Journal of Cognitive Psychology, 31(7), 725–749

    Article  Google Scholar 

  20. Mauger, C., Lancelot, C., Roy, A., Coutant, R., Cantisano, N., & Le Gall, D. (2018). Executive functions in children and adolescents with turner syndrome: a systematic review and meta-analysis. Neuropsychology Review, 28(2), 188–215

    Article  Google Scholar 

  21. McDonald, R. P. (1985). Factor analysis and related methods. Hillsdale: Lawrence Erlbaum.

    Google Scholar 

  22. McDonald, R. P. (1999). Test theory: a unified treatment. Mahwah: Lawrence Erlbaum.

    Google Scholar 

  23. Merz, E. C., Wiltshire, C. A., & Noble, K. G. (2019). Socioeconomic inequality and the developing brain: spotlight on language and executive function. Child Development Perspectives, 13(1), 15–20

    Article  Google Scholar 

  24. Muniz, J. (2004). La validación de los tests. Metodología de las Ciencias del Comportamiento, 5, 121–141.

    Google Scholar 

  25. Piccolo, L. d. R., Arteche, A. X., Fonseca, R. P., Grassi-Oliveira, R., & Salles, J. F. (2016). Influence of family socioeconomic status on IQ, language, memory and executive functions of Brazilian children. Psicologia: Reflexão e Crítica, 29(1), 23

    Google Scholar 

  26. Primi, R., Nascimento, R. S. G. F., & Souza, A. S. (2004). Critérios para avaliação de testes psicológicos, Em Conselho Federal de Psicologia - CFP (Org.), Avaliação dos testes psicológicos: relatório (pp. 31–55). Brasília: CFP.

    Google Scholar 

  27. R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL

    Google Scholar 

  28. Remigereau, C., Roy, A., Costini, O., Barbarot, S., Bru, M., & Le Gall, D. (2018). Praxis skills and executive function in children with neurofibromatosis type 1. Applied Neuropsychology: Child, 7(3), 224–234

    Article  Google Scholar 

  29. Revelle W (2020). psych: procedures for psychological, psychometric, and personality research. Northwestern University, Evanston, Illinois. R package version 2.0.9.

  30. Roche, J., Chevignard, M., Le Gall, D., Frappaz, D., Roulin, J. L., Fournet, N., & Roy, A. (2018). Exploration du fonctionnement exécutif chez les enfants et adolescents soignés pour une tumeur cérébrale. Approche Neuropsychologique des Apprentissages chez l’Enfant, 157, 731–741.

  31. Roy, A., Roulin, J. L., Charbonnier, V., Allain, P., Fasotti, L., Barbarot, S., … Le Gall, D. (2010). Executive dysfunction in children with neurofibromatosis type 1: a study of action planning. Journal of the International Neuropsychological Society, 16, 1056–1063.

  32. Roy, A., Barbarot, S., Roulin, J.-L., Charbonnier, V., Fasotti, L., Stalder, J.-F., et al. (2014). Is executive function specifically impaired in children with neurofibromatosis type 1? A neuropsychological investigation of cognitive flexibility. Applied Neuropsychology: Child, 3, 94–102.

  33. Roy, A. (2015). Approche neuropsychologique des fonctions exécutives de l’enfant: état des lieux et éléments de prospective. Revue de Neuropsychologie, Neurosciences Cognitives et Cliniques, 7(4), 245–256.

  34. Roy, A., Lodenos, V., Fournet, N., Le Gall, D., & Roulin, J. (2017). Le syndrome dysexécutif chez l’enfant : entre avancées scientifiques et questionnements. ANAE: Approche Neuropsychologique Des Apprentissages Chez l’Enfant, 29(146), 27–38.

  35. Roy, A., Kefi, M.-Z., Bellaj, T., Fournet, N., Le Gall, D., & Roulin, J.-L. (2018). The Stroop test: a developmental study in a French children sample aged 7 to 12 years. Psychologie Française, 63(2), 129–143.

  36. Roy, A., Le Gall, D., Roulin, J.-L., & Fournet, N. (2020). Un nouveau dispositif d’évaluation des fonctions exécutives chez l’enfant: la batterie FÉE. ANAE, 167, 393–402.

  37. Soveri, A., Lehtonen, M., Karlsson, L. C., Lukasik, K., Antfolk, J., & Laine, M. (2018). Test–retest reliability of five frequently used executive tasks in healthy adults. Applied Neuropsychology: Adult, 25(2), 155–165.

  38. Urbina, S. (2007). Fundamentos da testagem psicológica. Porto Alegre: Artmed.

  39. Van der Linden, M., Meulemans, T., Seron, X., Coyette, F., Andrès, P., & Prairial, C. (2000). L’évaluation des fonctions exécutives. In X. Seron & M. Van der Linden (Eds.), Traité de neuropsychologie clinique (Tome I) (pp. 275–300). Marseille: Solal.

  40. Willoughby, M. T., Kuhn, L. J., Blair, C. B., Samek, A., & List, J. A. (2017). The test–retest reliability of the latent construct of executive function depends on whether tasks are represented as formative or reflective indicators. Child Neuropsychology, 23(7), 822–837.

  41. Zelazo, D. (2020). Executive function and psychopathology: a neurodevelopmental perspective. Annual Review of Clinical Psychology, 16, 431–454.


The authors thank CAPES for the first author's scholarship and CNPq (423231/2016-2) for financing this project. We also thank Lydie Iralde and Philippe Allain for providing the Scripts task and Nathalie Fournet for the working memory tasks. We are grateful to Isadora Silvestre, Marinna Rezende, Yasmin Guerra, Laís Maia, and Gláucia Vivana for their participation in data collection. Finally, we thank all the experts who participated in the protocol adaptation steps.


This research was financially supported by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (423231/2016-2) and by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior through the first author's scholarship (1654284 and 88881.189733/2018-01).

Author information




AG collected the data, analyzed the results, and wrote the manuscript. AR, IH, JL, and DL were responsible for the project, the study design, and the general supervision of its execution, and contributed to preparing and writing the manuscript. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Amanda Guerra.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Guerra, A., Hazin, I., Roulin, JL. et al. Pieces of evidences of reliability of the Brazilian version of the Child Executive Functions Battery (CEF-B). Psicol. Refl. Crít. 34, 6 (2021).


  • Executive functions
  • Child
  • Neuropsychological assessment