Evaluation of Instructional Technology: A Case Study of Early Childhood Teacher Candidates

With the rise in technology use in early childhood classrooms, there is need to explore the strategies used to evaluate the effectiveness of such technology. Research indicates the proliferation of unvetted technology tools on the market and in online open source formats. Meaningful technology evaluation needs to be completed before, during, and after implementation in the lesson. Using a qualitative research design, data were collected from teacher candidates’ post-practicum reflective essays and one-on-one interviews. Findings indicate that during practicum a few teacher candidates used sound pedagogical strategies to evaluate the appropriateness and effectiveness of instructional technology. Findings also reveal the need for cooperating teachers to have strong technology pedagogical strategies in order to help teacher candidates who may be struggling with technology integration and assessment strategies. Based on the findings it is recommended that early childhood teacher preparation and professional development programs address alternative ways to assess the effectiveness of instructional technology tools used in the classrooms.


Introduction
Technology has transformed pedagogy in early childhood learning environments. Most early childhood classrooms today are equipped with some type of technology, including but not limited to smart boards, projectors, computers, and iPads. Fairly recent statistics indicate that more than 95% of teachers in the U.S. have access to computers and the Internet in the classroom (NCES, 2010). In 2013, NCES reported that 71% of the U.S. population aged 3 years and over used the Internet. Inferring from these statistics, one can conclude that more than 90% of early childhood teachers (K-3rd grade) use computer technology with young children today. Today's early childhood technology research, debates, and commentaries no longer question whether technology should be fully integrated in childhood education (see Fool's gold: A critical look at computers in childhood, 2000) but encourage it. Some encourage it through integration of STEM into early learning (Dossani, 2016). Major research funding agencies (such as the Caplan Foundation for Early Childhood Education, the National Science Foundation (NSF), and the U.S. Department of Education) support research in early childhood education that seeks to increase knowledge and skills on how best to integrate STEM concepts in the early learning curriculum. Some of the central questions now are: What type(s) of instructional technology tools are developmentally appropriate and effective to use with young children? How do teachers use technology with young children to enhance STEM concepts? These questions could be answered if teachers consistently collected and analyzed instructional technology assessment data. Of all the components of STEM, this paper focuses on technology integration and use with young children.
Early childhood teachers integrate computer technology in the classroom using interactive software and media (McManis & Gunnewig, 2012; NAEYC, 2012; Zaranis, Kalogiannakis & Papadakis, 2013) to support activities such as virtual field trips, simulations, webquests, and educational games (Jenkinson, 2009; ISTE, 2002; Ntuli & Kyei-Blankson, 2010; NAEYC, 2012). Research indicates that in addition to school district-purchased software, early childhood teachers spend time sifting through the Internet in search of additional online programs (or open source software) that are developmentally appropriate to infuse in their classroom activities (Shamburg, 2004). Despite all the effort that teachers are making to integrate technology, little is known about the effectiveness of such technology in young children's learning process and ability to transfer knowledge (Jenkinson, 2009; Ntuli & Kyei-Blankson, 2012; Shields & Behrman, 2000). The following literature review summarizes research on technology integration in the classroom and on the evaluation of instructional technology.

Literature review

Technology integration in the classroom
The bulk of the literature related to technology integration in the classroom focuses on the theoretical design of instruction that integrates technology. Such theoretical designs assume that evaluation of instructional technology is considered and embedded within the instructional planning process. Instructional technology design theory (ITDT) taught during teacher preparation follows such assumptions, and it is insufficient: most teacher candidates on field experience/internship and some novice teachers struggle with application of the theory in authentic classrooms because they lack both personal and vicarious experiences (Brown, 2006; He & Cooper, 2011; Ertmer, 2005; Judson, 2006; Ma, Williams, Perejean, Lai & Ford, 2008). The reality is that most technology courses offered in teacher prep programs lack a meaningful field experience component (Bucci & Petrosino et al., 2004; Ma et al., 2008). In most cases, teacher candidates take field experience courses after completing technology courses, which are mostly theoretical in nature. In ITDT there are integration models (such as the dynamic instructional design (DID) model by Lever-Duffy and McDonald (2011) and the technological pedagogical content knowledge (TPACK) model by Mishra and Koehler (2006)) that, if applied appropriately in authentic classrooms, result in effective technology integration. Based on the aforementioned theoretical frameworks, it is important to study how teacher candidates apply theory in the classroom, specifically, how they decide on the types of technology to use and/or how they evaluate the effects of such technology before, during, and after lesson implementation. Such feedback is important in order to evaluate the impact of instructional technology models used to teach technology integration and assessment/vetting/evaluation of instructional technology tools.
This study is timely because millions of dollars are being invested in learning with technology in P-12 schools to boost STEM education (Amiel & Reeves, 2008; Bohlin, 2002; Chen, 2004; STEM for All, 2016); therefore, practical knowledge of how to evaluate the effectiveness of the instructional technology tools in which schools invest is necessary. Without that practical knowledge, a lot of STEM funding and time will be wasted on tools that are not effective.

Evaluation of instructional technology
The issue of evaluating the effectiveness of educational technology tools has been raised mostly by researchers beyond the early childhood discipline (Karolcik, Cipkova & Hrusecky, 2015; Jenkinson, 2009; Robson & Schraw, 2008). Karolcik et al. (2015: 243) note that "despite the fact that digital technologies are more and more used in the learning and educational process, there is still lack of professional evaluation tools capable of assessing the quality of digital teaching aids in a comprehensive and objective manner". Jenkinson (2009: 263) argued that the current ways of evaluating the efficacy of educational technology are failing to "capture complex interactions that occur between the learner and the object". Robson and Schraw (2008) note that current studies that attempt to measure the effectiveness of educational technology report varied results. Some of the empirical research that supports the use of certain types of technology in learning is not founded on good research design, and the results are flawed and biased. Though some of the results from such research are generalizable, the studies that led to the results have not been replicated to find out whether the findings are reliable. Along the same lines, Reeves (2007: 274) argued that many such studies are "one of quasi-experimental studies that are not linked to any particular research agenda". Chingos and Grover (2012) indicated that determining the effectiveness of any type of instructional materials through large-scale randomized experiments is rare because it is expensive and time consuming. In addition, they (ibidem: 6) argued that "many instructional materials have not been evaluated at all, much less with studies that produce information of use to policymakers and practitioners…this problem … worsens with the explosion of open-source web-based instructional materials".
Most importantly, most of the studies have not considered this issue from the perspectives of early childhood teachers, and very few of them have used qualitative research. This article is based on qualitative research conducted with pre-service early childhood teachers on how they evaluate the technology tools they use at the K-3rd grade level during field experiences. The study sought to find out the ability of teacher candidates to put theory into practice, specifically, the methods and processes used in the evaluation of instructional technology tools before, during, and after implementation with students.

Research question
The overarching question that guided the study was: What strategies do early childhood teacher candidates use during internship to evaluate the effectiveness of the instructional technology materials before, during, and after use in their classrooms?

Significance of the study
This study is essential in that it contributes to the literature on teacher evaluation of technology tools by exploring this issue from the early childhood teacher candidates' perspective. It is important that teacher candidates learn from cooperating teachers alternative ways in which they might evaluate technology tools and programs, especially with regard to open source web-based instructional materials that have so much to offer in the instruction and learning process. In addition, this study may help early childhood teacher preparation and technology professional development programs to reflect on how they infuse evaluation methods of instructional technology in the curriculum.

Research design, context, and participant selection
This purposeful qualitative study used typical sampling to explore the strategies teacher candidates use to evaluate the effectiveness of instructional technology during internship (or field experience). Typical sampling in a purposeful study involves the selection of participants that best represent the population and the phenomenon under study (Edmonson & Irby, 2008; Merriam, 2009). It is important to note that early childhood refers to children from birth through age 8 (NAEYC, 2012). This study purposefully selected teacher candidates preparing to teach 5-8 year olds (K-3rd grade). Participants of the study included fifteen K-3 teacher candidates (three kindergarten, four 1st grade, three 2nd grade, and five 3rd grade teachers) who were enrolled in the college of education in a southeast Idaho university and had completed an instructional technology methods course and their first field experience. During field experiences, the teacher candidates were paired with a mentor teacher (or cooperating teacher). It is important to note that among the participants, four (one 1st grade, one 2nd grade, and two 3rd grade candidates) were in a blended early childhood program preparing to certify to teach in both general education and special education classrooms. The four candidates were placed in special education classrooms. The assumption in this study was that by collecting data from teacher candidates who were supervised by cooperating teachers (CTs), the study would also capture what the pre-service teachers observed from CTs. (Informed consent was obtained from all participants of this study.) Participation in the field experience required that the teacher candidates spend a total of 150 hours in the classroom.
The candidate was to help with designing and planning technology-integrated lessons and activities, and to implement at least six activities while being monitored by the CT (two of the six activities were formally evaluated by university supervisors). The field experience internship provided the teacher candidates with an opportunity to apply new technologies and technology integration methods learned from the instructional technology course in a real-life classroom, and to observe and learn how CTs integrate and evaluate the effectiveness of such technology tools.

Summary of technology evaluation process
Teacher candidates were expected to apply what they learned in instructional technology courses. There are steps that teachers should take before they integrate technology in the classroom. Table 1 summarizes the processes expected of early childhood teachers to ensure that they collect functional data about the appropriateness and effectiveness of the tools they use with young children. The summary is developed from an extensive research-based literature review that focuses on developmentally appropriate technology and early childhood technology evaluation instruments (Children's Technology Software Review, 2014; NAEYC, 2012; Haugland, 2005; Haugland & Wright, 1997; Haugland & Shade, 1994; Ntuli & Kyei-Blankson, 2012; Ntuli & Kyei-Blankson, 2013; Wardle, 2002). Information in Table 1 was used to develop the coding instrument discussed under data collection and analysis.

Table 1. Summary of the technology evaluation process

Before Implementation
Search for information about the tool on the Internet (specifically how to operate the tool with young children, whether the tool is developmentally appropriate* and could be customized).
Read reviews online about how other teachers have used the tool (write down positives and negatives).
Read peer-reviewed practitioner journals on the use of such technology with young children.
Reflect on how to minimize the negatives if you were to use it in your classroom context.
If the negatives can be minimized, plan for integration of the tool using an instructional design model.
Develop an observational tool that will allow documentation of information about the tool as students use it. (Include a checklist with desired behaviors or skills that students should be able to attain as they use the tool.)

During Implementation
Monitor the students by moving around and observing how they use the tool.
Use the observational tool to document what you see and hear. (The observational tool can include a checklist with pre-determined desired behaviors or skills that the teacher wants the student to attain as a result of using the tool. On the same observational tool, space for open-ended comments can be provided to document what the teacher sees/hears.)
As you move around, ask students if the tool is helping them to complete their task with ease or not.

After Implementation
Review students' final products.
Review students' grades before and after implementation of the tool.
Interview students about the tool in groups and individually.
Review students' journal entries about the tool. Guided reflection questions are important at elementary level. For instance, "I like spellingcity.com because…" or "I had a difficult time using…".
Develop a rubric with students for self-evaluation after using the tool.

*Developmentally appropriate software is based on the following criteria: the content is age appropriate, the vocabulary is age appropriate, the software provides problem solving opportunities, the program begins with what children already know and gradually introduces the concepts, the software does not provide undesirable behaviors, the software encourages active involvement and stimulates the child's interest, the instructions are easy to follow, the program is easy to navigate and allows children to use the program independently, and the software allows children to make changes in the environment without receiving threatening feedback. The described criteria align with the official description of age appropriate materials (NAEYC, 1996) and of technology and media for young children (NAEYC, 2012).

Data collection and data analysis
Qualitative data were collected in two phases. In the first phase, data were collected from the teacher candidates' field experience reflective essays. Specifically, the researcher used a section of the reflective essays which required candidates to address the assessment/evaluation of instructional technology materials before, during, and after implementation. In the first phase data were coded and themes were generated and organized into categories and subcategories (Saldana, 2009). It is important to note that information in Table 1 was used to further organize themes into categories and subcategories. In the second phase, one-on-one semi-structured interviews were conducted to clarify data collected from reflective essays. The interviews with teacher candidates lasted approximately forty-five minutes each. Probing questions were used to acquire an in-depth understanding of the phenomenon under study. Data from the interviews were coded and analyzed for themes and patterns using open coding (Creswell, 2011;Saldana, 2009).

Establishing credibility and trustworthiness of data collected
To establish credibility and trustworthiness of qualitative data, the researcher used "triangulation" of data and "member checking" (Edmonson & Irby, 2008). Triangulation is a technique where the researcher engages in cross checking of data sources and interpretation (Krefting, 1991). In this study, triangulation involved cross checking data from the reflective essays against the interview data. Member checking involves giving the participants data to review for accuracy and to check for inconsistencies (Edmonson & Irby, 2008; Krefting, 1991; Lincoln & Guba, 1985). In this study the participants reviewed the transcribed interview data to confirm or disconfirm the reliability of the interpretations derived from the qualitative data. The reflective essay assignment guidelines were reviewed by a panel of instructional technology instructors to ensure content validity and alignment with the program standards and ISTE technology standards for teachers. The interview protocol was reviewed by subject matter experts to ensure reliability and validity of the study (Guba, 1981).

Findings
Data from the study were coded and used to answer the research question: What strategies do early childhood teachers use to assess the effectiveness of the instructional technology materials/tools before, during, and after instruction? Table 2 summarizes the major findings to be discussed in detail.

Table 2. Summary of major findings
After lesson implementation
Reflective essays: review student grades
Interviews: student one-on-one or group interviews
Key: ✓ = use of some evaluation strategy; x = no evaluation strategy reported; G = general education classroom; S = special education classroom.

How do you evaluate the appropriateness and effectiveness of technology before integration in the lesson?
The findings from the reflective essays show that irrespective of the grade level, none of the teacher candidates were involved in evaluating the technology tools prior to their being incorporated in the lesson. In the reflective prompt they were asked to explain how they evaluate technology tools before integration in the lesson. The following excerpts come from teacher candidates' reflective essays:
It was surprising to read Excerpt 1, because all teacher candidates were expected to participate in the planning process. A follow-up in the interview revealed that the teacher candidates had minimal participation in the planning process (except for a few lessons that the candidate actually implemented), as planning was completed during the weekends or after hours by the CT. Excerpt 2 indicates that the teacher candidate believed that because an evaluation had been completed by another teacher before implementation, the tool was appropriate for use in the same grade level. This is contrary to early childhood best practices, which advocate reevaluation of instructional materials by individual teachers (Aldridge & Goldman, 2007; Copple & Bredekamp, 2009; NAEYC, 2009). It is important to note that being developmentally appropriate does not imply that the tool is effective. The process of reevaluating instructional materials ensures that materials are adapted to meet the needs of individual students in a particular classroom context. All learning is situated; therefore, what works for one teacher may not work for another, depending on the needs of the students.

How do you assess the effectiveness of technology during lesson implementation?
Findings from reflective essays, corroborated by teacher candidate interviews, show that teacher evaluations during implementation typically took the form of observations.
In a follow-up interview, one first grade teacher candidate said: "As students worked on the computers, I moved around monitoring if they were able to play the game … sometimes, I played the game to demonstrate how they should do it …".
One kindergarten teacher candidate said: "… at the technology center there is always someone to help monitoring the kids to complete the task … we were three in the classroom [the CT, teacher candidate, and an aide]".
Though most teacher candidates indicated that they used observations, third grade teacher candidates and two special education candidates in 1st and 2nd grade indicated that they documented their observation data using checklists and anecdotal notes.
A 3rd grade teacher candidate in a special education classroom said: "We [the CT and teacher candidate] planned ahead and wrote the skills on a checklist that students should be able to meet when they play the game … here is a checklist from my lessons [displaying a sample checklist (using an iPad) from her technology integration e-portfolio]. This is very effective because when we looked at the checklist information [data], say over a period of two weeks, we were able to tell if the game is working [effectively] for the kids or not".
A 2nd grade special education candidate said: "My CT advised me to write anecdotal notes as I moved around observing the students. At times we needed to reflect on what we saw students doing or the questions that students asked as they worked on the computer. If we did not write notes we wouldn't remember exactly the problems we saw".
The fact that all those placed in special education classrooms used some form of documentation of what they observed could lead one to falsely believe that special education teachers are expected to track their students' growth more than general education teachers. Anecdotal records and checklists are highly encouraged at the early childhood level when collecting data through observations (McAfee & Bodrova, 2006; McDevitt & Ormrod, 2013) because they give the teacher data for reflective thinking and decision making on whether the technology tool is effective or not.
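As an illustration only, the kind of checklist review described by the third grade candidate (looking at checklist data over roughly two weeks) could be tallied as in the following minimal Python sketch. The function name, the skill labels, and the sample records are hypothetical and are not data from this study; the sketch simply reports, for each pre-determined skill, the share of observations in which the skill was attained.

```python
from collections import defaultdict


def skill_attainment(observations):
    """Return, per skill, the share of checklist entries marked as attained.

    `observations` is a list of (student, skill, attained) tuples, one per
    checklist entry recorded while students used the tool (assumed layout).
    """
    seen = defaultdict(int)  # how many times each skill was observed
    met = defaultdict(int)   # how many of those observations were attained
    for _student, skill, attained in observations:
        seen[skill] += 1
        if attained:
            met[skill] += 1
    return {skill: met[skill] / seen[skill] for skill in seen}


# Hypothetical checklist entries (illustrative, not from the study).
observations = [
    ("A", "matches letter to sound", True),
    ("B", "matches letter to sound", False),
    ("A", "navigates game independently", True),
    ("B", "navigates game independently", True),
]
rates = skill_attainment(observations)
for skill, rate in rates.items():
    print(f"{skill}: {rate:.0%}")
```

A low share for a skill would not by itself show that the tool is ineffective; as the candidates noted, it is a prompt for reflection alongside anecdotal notes.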

How do you evaluate the effectiveness of technology after lesson implementation?
Findings show that evaluation of the effectiveness of technology after lesson implementation involved reviewing and grading students' assignments (or products) to make decisions concerning the effectiveness of the technology used.
In the reflective essays, one kindergarten teacher candidate wrote: "...after grading I check to see if students' scores are high. If the performance is high it means the technology is working." A first grade teacher candidate wrote: "…it depends on the quality of what the students are able to produce using the technology …".
Relying on such strategies alone is limiting because teachers cannot determine with certainty whether the objectives were met as a result of using the technology. It could be that objectives were met due to other instructional materials that supplement the instructional technology materials used in the class.
Interviews also provided another dimension that was not mentioned by teacher candidates at the K-2 grade levels. Teacher candidates in third grade general education classrooms indicated that they interviewed their own students to learn the extent to which students thought the technology was effective.
One teacher candidate said: "After using technology … I usually have a one-on-one interview with students … I sample students … I can't interview all the students…only the high achieving, the mediocre, and the low achieving. I ask questions such as -was the game [
Interviewing or questioning students to determine the effectiveness of the technology tools in the classroom is one approach that is highly recommended (Roblyer & Doering, 2013). It is important to introduce higher-order questioning from kindergarten so that young children develop critical thinking skills that help them make good and appropriate choices. Questioning students about the technology tools should not only come at the end of the lesson; teachers should ask questions about how the tool is working during the lesson (Lever-Duffy & McDonald, 2011). Lever-Duffy and McDonald emphasize the need for continuous feedback from students when teaching with technology. This helps with the overall feedback required to make decisions on whether to continue integrating that specific technology tool. In some cases, teachers may decide to continue with the integration but add scaffolds, depending on what the student interview data suggest.
Assessment of instructional materials after the lesson should provide a holistic picture of the effectiveness of the instructional materials. In addition to what the candidates mentioned, teachers may use alternative strategies such as developing rubrics and electronic portfolios (Barrett, 2001; Roblyer & Doering, 2012). Teachers need to be encouraged to develop technology rubrics that may be used by both the students and the teachers to evaluate the effectiveness of the technology (Ntuli & Kyei-Blankson, 2013). Electronic portfolios where students' work is collected over time should include artifacts such as reflective notes on instructional technology tools that helped the students to accomplish the task. With young children the teacher can use guided reflective prompts (such as "I like using … to learn my letter sounds" or "the program … was helpful in learning about fractions"). Such portfolio artifacts help the teacher when reflecting on the effectiveness of technology tools integrated in the classroom over time.

Unique findings
One reflective essay documented one kindergarten special education teacher's way of assessing the effectiveness of technology during and after the lesson. The teacher candidate described how the CT adapted the concurrent time series probe approach (CTSPA), which has been found to help teachers with technology outcomes documentation (Parette, Blum, & Boeckmann, 2009; Smith, 2000). The CTSPA has been used in documenting the effectiveness of assistive technology. It involves the teacher collecting performance measures of a child completing a specific activity both with and without technology over a period of time, with the teacher deciding on a reasonable length of time to collect the data (Edyburn, 2002). The candidate observed the CT collecting student performance data with and without technology for a month to find out the difference (increase or decrease) in the number of students meeting objectives. Collection of authentic assessment data about the tool and student performance over a period of time for decision making is highly recommended in early childhood education (Johanson, Bell, & Daytner, 2008; McAfee, Leong & Bodrova, 2006). In this case, the candidate learned not only about an evaluation strategy but also about the importance of having a data storage system in place to easily store and analyze data on the effectiveness of the instructional technology tools. Such unique experiences during practicum are enriching to teacher candidates.
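The with/without comparison at the heart of the CTSPA can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the CT's actual procedure: the function names, the per-session record layout (students meeting the objective, students observed), and the numbers are all hypothetical.

```python
def share_meeting_objective(sessions):
    """Average, across sessions, of the share of students meeting the objective.

    `sessions` is a list of (students_meeting, students_observed) pairs
    (assumed layout); sessions with no students observed are skipped.
    """
    shares = [met / observed for met, observed in sessions if observed > 0]
    return sum(shares) / len(shares) if shares else 0.0


def ctspa_difference(with_tool, without_tool):
    """Positive values mean more students met the objective with the tool."""
    return share_meeting_objective(with_tool) - share_meeting_objective(without_tool)


# Hypothetical probes collected over the same period (not study data).
with_tool = [(6, 8), (7, 8)]
without_tool = [(4, 8), (5, 8)]
diff = ctspa_difference(with_tool, without_tool)
print(f"Difference in share meeting objective: {diff:+.2f}")
```

A simple difference like this does not rule out other explanations for a change in performance; it only organizes the probe data so that the teacher can judge whether the tool appears to help.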
Another unique strategy emerged from interviewing a third grade teacher candidate. The candidate indicated that they invented the red-cards-up strategy for use with students. The candidate described red-cards-up as a technique where students individually raise a red card during the lesson to alert the teacher when they need scaffolding. The more red cards are raised during a technology-integrated activity, the more likely it is that the technology tool is not effective. This strategy has the potential to be effective because of the notes on the front ("I will use the tool again") and back ("I will not use the tool again") of each card. If one uses the technique appropriately and consistently, one may be able to gather assessment data that measure the effectiveness of instructional technology materials. The notes on the cards play an important role in helping young students to decide whether they will use the tool again. The teacher collects the cards in two piles at the end of the lesson (organized by students' choice of front or back) for further documentation about the tool.
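The two signals that the red-cards-up strategy yields (cards raised during the lesson, and the front/back piles collected afterwards) could be recorded as in the minimal Python sketch below. The class name, field names, and counts are hypothetical and are offered only to show one possible way of documenting the data.

```python
from dataclasses import dataclass


@dataclass
class LessonCards:
    students: int            # students taking part in the activity
    cards_raised: int        # students who raised a red card at least once
    will_use_again: int      # cards handed in front side up
    will_not_use_again: int  # cards handed in back side up


def summarize(lesson):
    """Return the share of students who needed help and the share choosing reuse."""
    returned = lesson.will_use_again + lesson.will_not_use_again
    return {
        "help_rate": lesson.cards_raised / lesson.students if lesson.students else None,
        "reuse_rate": lesson.will_use_again / returned if returned else None,
    }


# Hypothetical lesson record (illustrative only).
lesson = LessonCards(students=20, cards_raised=5, will_use_again=15, will_not_use_again=5)
summary = summarize(lesson)
print(summary)
```

Keeping one such record per lesson would let the teacher look at both rates over time rather than relying on memory of a single activity.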

Discussion
In this study, none of the teacher candidates participated in the evaluation of technology tools before implementation. This is troubling considering that teacher candidates learned (in the technology integration course) about the processes they need to follow to ensure the appropriateness and effectiveness of the tools they use with young children. In addition, the teacher candidates were paired with CTs so that they could learn from them. However, data indicate that some teacher candidates had little interactive planning time with the CTs. This defeats the main purpose of field experience for teacher candidates, that is, to have authentic classroom experience with mentorship. Intentional and focused communication between the CT and instructional technology instructors is needed to clarify the role of teacher candidates in the classroom, and a discussion of the nature of reflective essays or any other artifacts from the field experience is necessary. Without taking credit away from some CTs who had unique strategies that they shared with teacher candidates, it is expected that CTs provide more such opportunities for candidates. If the CTs have limited technology pedagogical and assessment knowledge, they need to be encouraged to take technology professional development.
While data from this study did not yield information on how teachers evaluate instructional technology prior to the lesson, early childhood education research strongly encourages evaluation of technology before implementation to ensure that it is aligned with early childhood curriculum and integration methods. Though most teacher candidates used observation to evaluate technology during implementation, they did not practice rigor in documenting the observed data, and there was no consistency across grade levels in terms of the strategies used. Overall, the study reveals that teacher candidates have limited strategies and skills to evaluate technology in real classrooms despite comprehensive preparation during teacher training. One teacher said they forgot to use the evaluation instrument that they learned about in one of the technology courses. Early childhood teacher training programs need to encourage teachers to engage in the process of evaluating instructional technology all the time to ensure that instructional materials are developmentally appropriate and that they are helping diverse students to achieve the learning outcomes. The argument this study brings forth is that early childhood cooperating teachers need to apply assessment strategies and techniques consistently with teacher candidates; the strategies should align with what is advocated by early childhood research and best practices when it comes to the evaluation of the effectiveness of early childhood instructional technology materials. This should be reinforced in professional development programs. The professional development curriculum may infuse alternative evaluation strategies such as those presented in Table 1, which are based on an extensive review of early childhood educational technology and media materials (Buckleitner, 1999; NAEYC, 2012; Haugland, 2005; Haugland & Wright, 1997; Haugland & Shade, 1994; Ntuli & Kyei-Blankson, 2012; Ntuli & Kyei-Blankson, 2011).

Conclusion and recommendations
Teacher preparation and professional development programs have a responsibility to make teacher candidates and in-service teachers aware of the importance of evaluating instructional technology materials before, during, and after technology integration. Even though the literature reveals how challenging it is to assess the impact of instructional technology materials, that difficulty should not lead early childhood teachers to adopt such materials without individually assessing whether they are effective enough to meet the diverse needs of students in different classroom contexts. Given the potential of technology to increase cognitive developmental gains in early childhood and to support a variety of learning styles, empirical research that examines the alternative strategies currently used to evaluate the effectiveness of early childhood instructional technology materials is crucial. Such feedback is necessary not only to compile evaluation strategies that work but also to categorize efficient early childhood instructional technology tools. With that feedback, software developers would be better placed to develop functional technology for early childhood education.

Introduction
The Greek population has experienced socioeconomic changes with a clear psychological impact, mainly since 2010, when the global financial crisis affected many countries of Europe, including Greece (Giotsa & Mitrogiorgou, in press). Children are a vulnerable population group and, according to Anagnostopoulos and Soumaki (2012), the psychological effect of crisis is obvious, not only by the child psychiatric services' data, but also by children's behavior within their environment (family, school, social life). Specifically, Soumaki (2012, 2013) argue that there are increased percentages of psycho-social problems in childhood (rise by 40%).
The preschool years are considered the best time for the early detection of internalizing problems, as well as the right period for early intervention to address them (Achenbach & Rescorla, 2009; Cole et al., 2008; McCabe & Altamura, 2011). In Greece, where schooling begins at the age of four and the time spent in preschool has increased (all-day preschool), the school context has become an important setting where these types of difficulties can be detected. In addition, the importance of evaluation at preschool age is heightened by the increased rates of internalizing problems in children (Briggs et al., 2013; Knitzer & Perry, 2009). Finally, longitudinal research confirms that internalizing problems in preschoolers are stable and continuous during childhood (Achenbach et al., 1987; Fuchs et al., 2013; Keenan et al., 1998) and adolescence (Bosquet & Egeland, 2006; Strickl et al., 2011) for about 23% to 61% of the children.

Internalizing problems
In Caregiver-Teacher Report Form (C-TRF) for ages 1½ -5 (Achenbach & Rescorla, 2009), the main data collection tool in our research, the term internalizing problems refers to problems of a child's inner world (Achenbach & Rescorla, 2000). Internalizing problems include syndromes such as emotional reactivity, depression/anxiety, somatic complaints and withdrawal (Achenbach & Edelbrock, 1984). According to Manolitsis and Tafa (2005), there are no age differences in the incidence of internalizing problems in preschoolers. However, research results show that it is more likely for girls to exhibit these problems (Beidel et al., 2000;Beyer et al., 2012;Kazdin, 1995;Manolitsis & Tafa, 2005;Morgan et al., 2008;Ollendick & King, 1991). The type of school (all-day or half-day preschool) is also a factor linked to internalizing problems exhibited by preschool students, since extended school time may be stressful and exhausting (Emery et al., 1998;Mashburn & Henry, 2004). Finally, Pianta et al. (2005) argue that positive emotional interactions can be more easily developed in preschool classes with a small number of students.

Emotional reactivity
Emotional reactivity refers to the tendency of people to experience frequent and intense emotional stimuli (Karrass et al., 2006;Rothbart & Derryberry, 1981), which help them achieve their objectives and adapt to different environments (Campos et al., 2004). According to Achenbach, emotional reactivity is an internalizing problem which can be tested by using detection tools constructed by him and his associates (Achenbach, 1991;Achenbach et al., 2003;Achenbach & Rescorla, 2009).

Depression/anxiety
Depression, as a separate syndrome, is a psychiatric mood disorder characterized by excessive sadness and loss of interest in activities normally pleasant for the person (Liu et al., 2011). Anxiety can be described as a "state of worry without apparent cause" (Johnson & Melamed, 1979). Anxiety disorders are the most common type of psychiatric disorder in children (Costello & Angold, 1995), with separation anxiety disorder and selective mutism disorder appearing only in children (American Psychiatric Association, 1994). In the ICD-10 (International Statistical Classification of Diseases, 10th revision) of the World Health Organization (World Health Organization, 2010), depression and anxiety are mentioned together as a single syndrome called mixed anxiety/depressive disorder. This study is based on the categorization developed by Achenbach, in which depressive and anxiety disorders are not examined separately but as a single syndrome named "Depression/anxiety", classified as an internalizing problem (Achenbach, 1991; Achenbach et al., 2003; Achenbach & Rescorla, 2009).

Withdrawal
The term (social) withdrawal is used to describe the situation in which a child exhibits a systematic tendency to avoid peers and be isolated (Rubin & Coplan, 2004). This behavior may occur even if the peers are not strangers, but some familiar people (Hart et al., 2000;Rubin et al., 2002). Children with withdrawal behavior speak much less, when interacting with others, than children who do not exhibit such behavior (Schneider, 1999). Moreover, they have deficits in social competence and in cooperation skills (Bohlin et al., 2005).

Preschool teachers' perception of internalizing problems
In general, preschool teachers often state that the internalizing problems of their students are too demanding and difficult to deal with (Nutbrown & Clough, 2004). According to Poulou (2013b), teachers' perceptions of the internalizing problems exhibited by their students, as well as their interpretations of what causes them (temperament, family, school or wider social environment), will determine to a significant extent the way a preschool teacher confronts them. In particular, Liljequist and Renk (2007) argue that, for preschool teachers, the causes differ depending on the type of problem. Lovejoy (1996) states that teachers tend to attribute the development of internalizing problems to internal and more stable causes related to the child's temperament, and not to the different environments in which a child acts (family, school, peers, neighborhood).

Early detection
The stability of internalizing problems and their continuity over time are two factors that make early detection in preschool very important (Feeney-Kettler et al., 2010). Different researchers (Costello et al., 2003;Mesman et al., 2001;Richman et al., 1982) argue that the children who had developed internalizing problems in preschool age, continued to exhibit these problems later in their life.

Main research purpose and specific objectives
The main purpose of our research is to focus on the early detection of preschool children's internalizing problems, according to their teachers' perceptions. The specific objectives of our research were to investigate the factors influencing the detection of internalizing problems by preschool teachers. These factors are related to: (a) the specific characteristics of their students (gender, age); and (b) the characteristics of the school unit (type of school, number of children in the classroom).

Research hypotheses
Hypothesis 1. The classification of students to normal, borderline and clinical range, for the separate syndromes (emotional reactivity, depression/anxiety, somatic complaints and withdrawal) and internalizing problems depends on the gender.
Hypothesis 2. The classification of students to normal, borderline and clinical range, for the separate syndromes (emotional reactivity, depression/anxiety, somatic complaints and withdrawal) and internalizing problems, depends on their age (4-5, 5-6 years old).
Hypothesis 3. The classification of students to normal, borderline and clinical range, for the separate syndromes (emotional reactivity, depression/anxiety, somatic complaints and withdrawal) and internalizing problems, depends on the type of school (half-day, all-day).
Hypothesis 4. The classification of students to normal, borderline and clinical range, for the separate syndromes (emotional reactivity, depression/anxiety, somatic complaints and withdrawal) and internalizing problems, depends on the total number of children in the classroom.

Sample and design
The sampling method selected for this sample was random sampling in groups or "blocks" (cluster sampling) (Paraskevopoulos, 1993; Tomaras, 2005). This method was considered appropriate because it is used when the role of geography is important (Tomaras, 2005). In this study, the geographical coverage included all thirteen regions of Greece. The population consisted of all the children aged 4-6 years who attended preschool during the academic year 2011-2012 in every public preschool of Greece (Hellenic Statistical Authority, 2012). On a first level we selected all thirteen (13) geographical regions of Greece and, on a second level, we randomly selected seventy-seven (77) half-day and all-day preschool classes from all the regions. Preschool teachers, after they had been informed of the purpose of the research, completed: (a) the C-TRF (Achenbach & Rescorla, 2009) for every child in their class, and (b) a "Demographic Questionnaire" (Doni, 2015). These questionnaires were completed after class. After they had completed the questionnaires, the teachers sent them back by post (pre-paid envelope). The data collection took place during the academic year 2011-2012.

Instrumentation
(A) "Caregiver-Teacher Report Form (C-TRF) for ages 1½ -5", based on the Achenbach System of Empirically Based Assessment (ASEBA) (Achenbach & Rescorla, 2009). At the beginning, C-TRF has demographic questions followed by 97 closed-ended questions which reflect the opinions of educators (preschool teachers, childminders or people who take care of children of this age) about internalizing and externalizing problems. Finally, it includes three open-ended questions which are not scored. C-TRF enables a quality assessment of children, classifying boys separately from girls, through cutpoints, to those who belong to the normal range and those who belong to the clinical range. Between normal and clinical ranges, there is one called borderline range. The borderline range indicates that the rating of the child, to one or more syndrome scales, is high enough to create concern about providing the child with professional help, however, it does not deviate as much as a score that is in the clinical range (Achenbach & Rescorla, 2009, pp. 89-112). C-TRF has been "weighed and translated into more than 60 different languages" (Achenbach & Rescorla, 2009: 13), while its scientific documentation has been recorded, until now, in more than 8610 scientific articles (ASEBA, https://bib.aseba.org, 2014). C-TRF's scales are harmonized with the diagnostic categories of DSM-IV (Achenbach & Rescorla, 2009). For the present study we used the Greek version of C-TRF which was adjusted, validated and weighted in Greek by Ioannis Tsaousis in 2003 (Achenbach & Rescorla, 2009). Filling in this form does not require any special training, since the instructions are clear and helpful, so that teachers can perform the assessment quickly and easily.
(B) The "Demographic questionnaire" (Doni, 2015). This questionnaire was designed by the researcher for the purposes of the present research. It is not commercially available and consists of 11 closed-ended and open-ended questions that refer to demographic data of preschool teachers, as well as information on the type of school (half-day, all-day) and the number of students in the classroom.

Method of statistical analysis
The classification of children into the clinical, borderline, or normal range for emotional reactivity, depression/anxiety, somatic complaints and internalizing problems (or groups of syndromes) was tested for association with the recorded categorical variables using the Pearson chi-square test or, in cases where its conditions were not met, Fisher's exact test. For associations with continuous sample measurements, we carried out the necessary normality tests with Q-Q plots and the Shapiro-Wilk test. We used one-way analysis of variance (ANOVA) or the Kruskal-Wallis test and then multiple comparisons with the Bonferroni or Dunnett's criterion respectively, depending on whether the conditions were fulfilled. Results were analyzed with the use of multinomial logistic regression models (Garson & Anderson, 1982). In these models, the different syndromes and the internalizing problems were defined as dependent variables, by classifying children into the normal, borderline and clinical range. The particular characteristics of the children [gender, age (4-5 and 5-6 years old)], the type of school (half-day, all-day) and the number of students in the classroom were defined as independent variables.
According to Table 3, the overall rate of internalizing problems exhibited by the students of our sample was 10.4%, with 3.5% of the students in the borderline range and 6.9% in the clinical range. As regards the separate syndromes, 8.1% of the students experienced emotional reactivity (6.9% in the borderline range and 1.2% in the clinical range), 3.9% experienced anxiety/depression (2.8% in the borderline range and 1.1% in the clinical range), 7.7% exhibited somatic complaints (4.9% in the borderline range and 2.8% in the clinical range), and 10.9% experienced withdrawal (8.1% in the borderline range and 2.8% in the clinical range).
Table 4 summarizes the means and ranges of the sample's scores of emotional reactivity, depression/anxiety, somatic complaints and withdrawal. Statistically significant differences (Table 5) were observed in the distribution of normal, borderline and clinical cases, for internalizing problems in relation to: (a) the total number of children in the classroom, χ² = 19.08, p < .001 and (b) the children's gender, χ² = 6.9, p = .032. Statistically significant differences were found in the distribution of normal, borderline and clinical cases for:
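The Pearson chi-square test named in the analysis above can be illustrated with a short Python sketch. It is not the authors' analysis script: the function names and the 3 × 2 table of counts are hypothetical, and the closed-form p-value used here holds only for two degrees of freedom, which is what a table of three ranges by two genders produces.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return math.exp(-stat / 2)

# Hypothetical counts (not the study's data).
# Rows: normal, borderline, clinical. Columns: boys, girls.
table = [[80, 90], [12, 6], [8, 4]]
stat, df = pearson_chi_square(table)
print(df, round(stat, 3), round(p_value_df2(stat), 3))
```

As a consistency check on the reported gender result, `p_value_df2(6.9)` evaluates to roughly 0.032, which agrees with the p-value given for χ² = 6.9 if that test had two degrees of freedom. When expected counts are very small, an exact test such as Fisher's is the usual fallback, as the text notes.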

Discussion
The main purpose of this research is to focus on the early detection of preschool children's internalizing problems, according to their teachers' perceptions. As seen from the results, 94.6% of the children in our sample are included in the normal range. These findings agree with the research findings of Berkhout, Dolk and Goorhuis-Brouwer (as reported by Berkhout et al., 2012) and Berkhout et al. (2012), in which more than 90% of the children in the sample (98% and 96% respectively) were included in the normal range. The variation among these rates is probably due to the much smaller sample size in these foreign studies, which consisted of 228 preschool children, while ours consists of 1,234 children 4-6 years old (M = 5.65, SD = 0.64).
The prevalence of internalizing problems, in the children of our sample is 10.4%, of which 6.9% is included in the clinical range, while 3.5% is included in the borderline range. Our research findings are consistent with Harden et al. (2000), who confirmed that 6.5% of the children in their sample has internalizing problems in the clinical range. The data analysis, especially for each syndrome separately, shows that in most syndromes the clinical range rates are between 1.1 and 2.8%, while borderline range rates are between 2.8 and 8.1%. However, withdrawal receives the highest rate, since 10.9% of the children in our sample, exhibits this syndrome. 2.8% of these children is included in the clinical range and 8.1% in the borderline range. In Kontopoulou's (2003) research, which explored the views of preschool teachers in relation to major behavioral problems associated with the adjustment of the child at school, withdrawal received the highest rates (56%).
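The arithmetic behind these rates can be checked with a short Python sketch. The percentages and the sample size are taken from the text; the helper names are hypothetical, and the head counts are only approximations implied by the rounded percentages, since the text does not report counts.

```python
from decimal import Decimal

N_CHILDREN = 1234  # sample size reported in the study

# (borderline %, clinical %, reported total %) as stated in the results
reported = {
    "internalizing problems": ("3.5", "6.9", "10.4"),
    "emotional reactivity": ("6.9", "1.2", "8.1"),
    "anxiety/depression": ("2.8", "1.1", "3.9"),
    "somatic complaints": ("4.9", "2.8", "7.7"),
    "withdrawal": ("8.1", "2.8", "10.9"),
}

def sums_match(borderline, clinical, total):
    """True if the borderline and clinical rates add up to the total rate."""
    return Decimal(borderline) + Decimal(clinical) == Decimal(total)

def approx_count(percent, n=N_CHILDREN):
    """Approximate number of children implied by a rounded percentage."""
    return round(Decimal(percent) / 100 * n)

all_consistent = all(sums_match(*rates) for rates in reported.values())
print(all_consistent, approx_count("10.4"))
```

Using `Decimal` rather than floats avoids spurious mismatches caused by binary rounding when the one-decimal percentages are added.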
Anxiety/depression received the lowest rate (3.9%) according to our results: 1.1% of the children are included in the clinical range and 2.8% in the borderline range. The findings of this study are consistent with the findings of many (non-clinical) surveys, in which the rate of major depressive disorders does not exceed 2.0% in preschoolers (Egger & Angold, 2006; Kashani et al., 1997; Liu et al., 2011; Morgan et al., 2014). The low rate of anxiety/depression is probably due to the fact that internalizing problems are often ignored in school contexts, because children do not easily express them to their teachers (Rescorla et al., 2007). Moreover, the symptoms of depression in preschoolers usually go unnoticed because children "cannot verbally express unpleasant emotions" (Poulou 2013a: 148).

The effect of gender
There are statistically significant differences between gender in some syndromes and groups of syndromes (somatic complaints, withdrawal and internalizing problems), while in other syndromes these differences are not statistically significant (anxiety/depression and emotional reactivity). Our research findings are consistent with the research of Berkhout et al. (2012) and Furniss et al. (2006), where boys also exhibit higher rates of internalizing problems. However, in most surveys, boys exhibit lower rates of internalizing problems (Beidel et al., 2000;Beyer et al., 2012;Kazdin, 1995;Manolitsis & Tafa, 2005;Morgan et al., 2008;Ollendick & King, 1991). The different rates between the two genders (as they appear in the different studies), in the incidence of internalizing problems, occur, according to Renk (2008), due to the different standpoint of each informant. Winer and Philips (2012) report that a lot of research in elementary students confirms the existence of bias in teachers' evaluation of the behavior and performance in boys and girls. For example, girls' reading and math skills are often overestimated compared to boys'. Moreover, girls usually receive less criticism and establish, less often, confrontational relationships with the teacher of the same sex.

The effect of children's age
According to our research results, four hundred and thirty-seven (437) of the one thousand two hundred and thirty-four (1,234) children (35.4%) are 4-5 years old, and seven hundred and ninety-seven (797) (64.6%) are 5-6 years old. The correlation analysis between the children's age and the onset of the syndromes, the internalizing problems and the total problems showed no statistically significant difference. According to Poulou (2013b), research on age differences in child behavior during the preschool years is quite limited and the picture is rather obscure. The results of this research, however, are consistent with the research findings of Manolitsis and Tafa (2005), who did not observe any differences in the incidence of internalizing problems between 4-5 year olds and 5-6 year olds.

The type of school (half-day/all-day)
Particularly for emotional reactivity, 1.8% of the children who attended all-day preschool, was included in the clinical range, 9.7% was included in the borderline range and 88.5% in the normal range. 0.9% of the children who attended half-day preschool was included in the clinical range, 5.6% in the borderline and 93.5% in the normal range, respectively. It seems that extended school time, has a clearly negative (inclusion in the clinical range), or somewhat negative impact (inclusion in the borderline range) on the social and emotional functioning of the children in our sample. These findings are confirmed by other studies, according to which, the extended school time in preschool actually exhausts children (Emery et al., 1998), who consequently argue with their teachers and exhibit more internalizing problems, due to stress and frustration caused by the extended time spent at school (Mashburn & Henry, 2004).

The total number of children in the classroom
In classes with larger numbers of students, more children fall within the normal range rather than the borderline or clinical range for all four syndromes and for internalizing problems. Seeking a possible interpretation of these findings, we would argue that classes with larger numbers of children seem to create dynamics that are likely to result in either: (a) a limited incidence of internalizing problems in children, or (b) increased difficulty for teachers in detecting any issues. Given that internalizing problems in preschool children are mainly associated with dysfunctional relationships (Henricsson & Rydell, 2006), the infrequent incidence of such problems in crowded classrooms could possibly be due to the existence of positive interactions among students.

Conclusion
Through this study, there has been an effort, for the first time in Greece, to explore, on a national level, whether preschool students exhibit internalizing problems which can be detected by their preschool teachers. The findings show that boys exhibit higher rates of internalizing problems than girls. Moreover, the prevalence of emotional reactivity seems to be higher in preschoolers who attended all-day preschool. Finally, internalizing problems are less frequent in classes with larger numbers of students. These results could be used as a reference point and as a point of comparison in future, more specialized studies on the social and emotional functioning of preschool children. Moreover, they could be useful in a future revision of the Greek analytical curriculum for preschool education, which should aim at the social and emotional development of children among other targets.

Limitations -Suggestions for future research
In considering the methodological implications of the current study, the sampling design and, in turn, the risk of overgeneralizing the findings should be taken into account. The detection of internalizing problems in preschool children by their teachers, with the use of psychometric tools, carries a large degree of subjectivity, an element which exists almost as a "principle" in several studies which explore the perceptions of preschool teachers on this issue (Kleftaras & Didaskalou, 2006; Liljequist & Renk, 2007; Poulou, 2013b; Poulou & Norwich, 2000). In future research, information about a child may be collected from parents and other key persons (e.g. a grandmother or a grandfather), independently or in combination with the C-TRF, using a valid and reliable psychometric tool, the "Child Behavior Checklist for ages 1½-5" (CBCL) of Achenbach, which has also been standardized for Greek populations (Achenbach & Rescorla, 2009; Rescorla, 2009). Moreover, in a different study caregivers could be included in the sample, in order to avoid the overestimation of parental reports (Carter et al., 2004). Future research could also examine the effect of specific family-related variables on the incidence of internalizing problems in preschool children, for example parents' occupation, family income, family size, parents' age, child's place of residence and type of family (nuclear, extended, single parent, binuclear). Other factors such as the emotional climate of the family, parental psychopathology and rearing practices (parenting) are worth exploring. It would also be interesting for researchers to examine the effect of variables related to the psychological state of preschool teachers (stress, depression, job burnout) on the way they detect internalizing problems exhibited by their students.
Furthermore, the influence of factors associated with neighborhood and community, on the incidence of internalizing problems in children, is rather interesting for future research.

Computer Games and English as a Foreign Language: Results of a Pilot Study

Introduction
Educational systems are under constant pressure to change so as to effectively prepare students to meet the challenges of the 21st century. Given that technology has penetrated all aspects of our lives, education included, the instructional use of computers is a reality that slowly but steadily shapes new teaching methods, redefines existing ones, and also changes the content and the context of courses. An exception to the above is the teaching of English as a foreign language (EFL). Unlike in other teaching subjects, the use of computers in the teaching of EFL has been a reality for a number of years (Köksal, 2004). At the same time, new tools, as well as innovative uses of existing ones, provide new perspectives on how to significantly increase students' motivation (Pim, 2013) and make EFL teaching more interesting and effective (Morris, 2011; Macaro, Handley & Walter, 2012).
Among these tools are computer games. It is more than obvious that games play a central role in children's lives. There is also a consensus in the literature that games can play an important role in education (e.g. Prensky, 2001a). Game-based learning (GBL) (Prensky, 2001b) can be applied at all levels of education and in almost all subjects. GBL's supporters believe that through games (digital or analog) most learning objectives can be achieved and that games have a significant impact on students' interest and motivation for learning (e.g., Bottino, Ferlino, & Tavella, 2007; Ke, 2008; Papastergiou 2009).
• Games are used in many teaching subjects.
• The computer games were used without the teacher's intervention.
• The results of the project were mixed.
• Highly positive attitudes towards the use of computer games were noted.
• Computer games can be used in EFL teaching.
Taking into account the above, it was decided to examine whether computer games can support the teaching of EFL in mainstream primary school settings. Towards this end, a pilot project was designed and implemented, the results of which are presented in the coming sections. The paper is organized as follows. First, a brief review of the literature on GBL and on the use of computer games in EFL is presented, followed by the project's methodology and results. Subsequently, results are discussed and the conclusion completes the work.

Digital games in teaching
Nowadays, computer games are used by the majority of children and teenagers for their entertainment (Ofcom, 2013). Computer games also have an impact on education, to such an extent that Prensky (2004) claimed that they are the most powerful learning tools. Indeed, over the past twenty years, there has been a surge in research on GBL (e.g., Felicia, 2012; Gee, 2014; Ke, 2009; Prensky, 2007; Squire, 2005). One of the main arguments for using computer games in teaching is that they provide experiences in environments that are rich, sophisticated, interactive, and similar to real-life conditions. Therefore, the learning experience, which is considered to be the basis for the construction of knowledge, is not simply transmitted but is the result of reflection and interaction with the (game) environment (Braghirolli, Ribeiro, Weise & Pizzolato, 2016). In addition, students are encouraged to explore and experiment, which leads to the discovery of new concepts and strategies (Kirriemuir, 2002). Another important feature of computer games is that they provide immediate feedback; students can quickly see the results of their actions or whether they answered a question correctly (Prensky, 2001a). Moreover, students pay more attention to a learning activity when it occurs within a game (Garris, Ahlers & Driskell, 2002). Finally, it has been observed that when students play educational games, they tend to spend more time trying to learn (Sandberg, Maris & De Geus, 2011).
While computer games are considered to be particularly effective at younger ages (Prensky, 2001b), there is no common consensus in the literature regarding their exact impact on students' learning outcomes. Indeed, the results of the relevant studies are mixed with some researchers reporting improved learning outcomes, others reporting a negative impact, and others reporting no impact at all (e.g., Perrotta, Featherstone, Aston & Houghton, 2013). On the other hand, most researchers agree that computer games have a positive impact on motivation, engagement, and problem-solving skills (e.g., Connolly, Boyle, MacArthur, Hainey & Boyle, 2012).
Also, researchers suggested that learning with games has to be supported by effective instructional strategies (Egenfeldt-Nielsen, 2006) and a well-developed games' pedagogy (Ulicsak & Williamson, 2010). This brings the discussion to the learning theories that frame the use of computer games. It is true that diverse learning theories embrace their use in teaching (Dondlinger, 2007;Wu, Chiou, Kao, Hu & Huang, 2012). The ones based on behaviorism, view learning as an associative process, in which reinforcement plays an important role in changing the observed behavior. This perception is evident in many games, which seek to exercise concepts or skills with repetitive practices (Braghirolli et al., 2016;Kebritchi & Hirumi, 2008). On the other hand, educational games based on constructivist perceptions, support learning through the active participation of players/students in the learning process. In this case, the main purpose of computer games is to achieve student-centered interactive experiences, that enable the construction of knowledge on students' own pace (Shute, Rieber & Van Eck, 2011), thus, redefining the relationship between students and teachers (Becker, 2005).
Coming to the teaching of EFL, it should be noted that computer games were considered useful at the very early stages of the integration of computers in education. Indeed, Dickinson (1981) was among the first to describe specific methods for harnessing the potential of role-playing games and simulations. In general, computer games are used in a variety of ways in the teaching of all foreign languages. For example, Connolly, Stansfield and Hainey (2011) used alternative reality games for telling stories in English, where the action changed depending on the participants' choices. Larsen (2012) used computer games as the sole means of instruction, without the interference of an instructor. Yolageldili (2011) noted that computer games have a positive impact on the correct use of language, both in terms of grammar and listening comprehension. Sylvén and Sundqvist (2012) found that games improved the linguistic ability of students and also their level of understanding of English. Good results were obtained regarding the vocabulary of students aged 15-16 years who played games designed for this purpose (Sundqvist & Sylvén, 2014). Notably, these researchers proposed that computer games can be used even by younger students.
In general, the effectiveness of computer games in EFL teaching can be attributed to several factors. Firstly, they create a pleasant environment that reduces the stress that students feel when they learn a foreign language (Muhanna, 2012). They also provide opportunities to use the language in its natural context, particularly when it comes to multi-user online games (Benavides, 2001). In addition, Escudeiro and Vaz De Carvalho (2013) argued that games are effective because they enable users to learn from their own mistakes.
It is important to stress that most studies regarding the use of computer games in EFL teaching were based on games where the element of fun was secondary to the element of teaching. Therefore, it would be interesting to examine the results from the use of games where the element of fun is dominant. A second point that has to be noted is that the sample in the majority of studies consisted of young teenagers and not primary school students. Thus, further studies are needed in order to examine whether computer games are equally effective at younger ages. Finally, in almost all studies that dealt with the use of computer games in EFL teaching in mainstream school settings (primary or secondary), the games were used either as supplementary material or in conjunction with an instructor. Therefore, it would be interesting to conduct a study in which teachers are eliminated or play only a supportive role.

Method
Given that digital games present an interesting alternative method for teaching EFL to students, as presented in the preceding section, a project was designed and implemented in order to examine what the learning outcomes of such an endeavor might be, having as a target group fifth-grade primary school students (ages 10-11). The whole effort was based on the assumption that computer games can act as mediators between students and the learning material, allowing students to understand the subject through their own experiences, having control of their pace of learning, as the constructivist views dictate (Ertmer & Newby, 2013). A quasi-experimental design, with one experimental and two control groups, was chosen because data from intact classroom groups were analyzed for their differences in their learning outcomes, as it will be further elaborated in the coming sections.

Research hypotheses
On the basis of the above, the following research hypotheses were formed:
H1: Teaching EFL to primary school students with the use of computer games produces learning outcomes that are similar to those of conventional teaching methods.
H2: Students form positive attitudes regarding the use of computer games in the teaching of EFL.

Sample and duration
Fifth-grade primary school students (ages 10-11) were chosen as the target group because: (a) the literature review, as presented in the "Digital games in teaching" section, revealed that few studies have been conducted regarding the use of computer games for teaching EFL at younger ages, and (b) after an initial overview of this grade's school textbooks, a number of teaching units were identified that were deemed ideal for turning into computer games. All of the primary schools in the city of Rhodes, Greece, were contacted, as well as the teachers who teach EFL in these schools, in order to locate schools having an adequate number of computers, as well as students who met the following criteria: (a) to have never used computer games during their teaching, (b) to reflect the spread of ability in a typical mixed-ability Greek fifth-grade class, and (c) the mix of genders to reflect the ratio of boys and girls in a typical Greek primary school. In Creswell's terms, the sample was achieved by selecting "ordinary", "typical", and "accessible" cases (Creswell, 2012).
As a result, the initial sample consisted of sixty-six students coming from three fifth-grade classes of three different schools. Each class was randomly assigned one of the teaching methods described in the "Procedure" section. Prior to the beginning of the project, students' parents were informed of its purpose and objectives and their written consent for the participation of their children was obtained. Also, the participating teachers were briefed. The project lasted for three weeks (from mid-February to early March 2017), for a total of fifteen two-hour sessions (five in each class).

Materials
During meetings with the project's participating teachers and a more thorough review of students' EFL textbook, five units were selected for conversion into computer games. The theme in all of them was environmental awareness: (a) Unit 1 - Pollution, (b) Unit 2 - Meet recycling bins, (c) Unit 3 - What about electronics?, (d) Unit 4 - Do you love our planet?, and (e) Unit 5 - Air pollution.
For the development of the games (one for each unit), the programming environment of Microsoft's Kodu Game Lab (https://www.kodugamelab.com/) was used. Kodu is a programming environment designed exclusively for the rapid development of 3D games. It provides a very simple icon/tile-based visual programming language, which does not require prior knowledge of programming. Furthermore, it offers a library of ready-made cartoonish objects and characters and a set of manipulation tools to build the games' landscape. On the negative side, the developers have to develop their games using only Kodu's available media, since it does not allow the import and use of external media (e.g., 3D models, images, videos, and sounds).
It is important to stress that the games were not developed by a group of experts but by the participating teachers. Although one can argue that this resulted in games being "amateurish", it was important to examine the difficulties educators face when they have to develop their own games/teaching material to be used by their students. A 30-hour intensive course/seminar was held, since the teachers did not have any previous experience in using Kodu. In addition, they had at their disposal printed and audiovisual material for guiding them. Also, they were advised to follow Gee's guidelines for designing "good" educational games (Gee, 2009): (a) to provide simple control mechanisms, (b) to present the cognitive material clearly, (c) to provide compelling experiences for good learning, and (d) to allow users/learners to enact their own trajectories. Furthermore, they were asked to find ways of presenting the learning material in-game, because this was a prerequisite of the teaching methodology that was followed, as will be further elaborated in the "Procedure" section.
It has to be noted that the teachers were able to come up with interesting ideas for overcoming the limitations imposed by Kodu (e.g., a limited number of objects and media) (Figure 1). On the other hand, it was observed that all games were, essentially, drill and practice applications. This observation will be further elaborated in the "Discussion" section.
Figure 1. Screenshots from the games
Each game had a central level in which the learning material was presented and several smaller levels, which were mini-games allowing students to practice what they had learned. The texts and dialogues were exactly the same as in the school handbook. Also, all the necessary instructions (how to control the game and what to watch out for) were written in English. This was done because it was considered important to enrich students' vocabulary and also familiarize them with the syntax of the language. Since it was important for students to listen to the pronunciation of the words and because Kodu does not allow the import of audio, video clips were recorded using Kodu and screen capture software. In addition, these videos included the relevant theory and vocabulary. Students could access these videos after finishing playing the games. The design and development of the games and videos lasted for about four weeks (approximately 150 hours).

Procedure
As already mentioned, three groups of students participated in the study, coming from three different schools. Each group was assigned a different teaching method. The first group was taught conventionally and only the school textbooks were used. The teacher made a short introduction to the unit he/she was about to teach, followed by examples and/or demonstrations (using the class's video projector). Next, students worked individually, studying the relevant unit in the textbook and solving the exercises. During this stage, the teacher's involvement was minimal; only when needed did he/she pause students' work in order to provide guidelines and examples to the whole class. At the end of each session, the teacher and/or the students presented the solutions to the exercises and the students were asked to check whether their answers were correct. This teaching method is the prevailing one in Greek schools.
In the second group, the teaching method was based on constructivist views of learning/teaching. After a short introduction by the teacher to the subject of each unit, students were divided into groups of four, studied the relevant unit (from the school textbook), and solved the exercises collaboratively. Discussions and the exchange of views were encouraged by the teacher, who constantly urged students to use English for communicating among themselves. The last phase of the instruction was dedicated to collaborative activities, which were designed so as to encourage the use of English (for speaking and writing). For example, in Unit 2 - Meet recycling bins, there was a team game which involved the use of cardboard for making recycling bins.
Finally, the third group of students was taught exclusively using the computer games; the school textbook was not used at all. Students were divided into pairs and each pair had at its disposal a computer, where they played the games and watched the related videos. The teacher simply provided assistance in case of technical problems. As in the previous group, discussions and the exchange of views in English were encouraged. As a result of the above, three groups of students were taught the same units, for the same duration. What differed was the teaching method.

Instruments
The main instruments used for collecting data were evaluation sheets (including a pre- and a delayed post-test). The pre-test was of particular interest because it is known that most students study English in private evening schools or are home tutored. Therefore, participants' knowledge of English may vary considerably, and this could have had an impact on the study's results, leading to incorrect conclusions. The delayed post-test was administered about fifteen days after the end of the project and contained questions from all the units. Its purpose was to examine the sustainability of knowledge.
The evaluation sheets, as well as the pre- and delayed post-tests, had Yes-No, fill-in-the-blanks, open-ended, and multiple-choice questions, which, for compatibility reasons, were the same as (or similar to) the ones in the school textbook. It should be noted that, in addition to the above, translation of text from Greek into English and vice versa, as well as the writing of short sentences in English, were also included in the evaluation sheets.
One of the purposes of the study was to explore students' attitudes and perceptions on the use of computer games during their teaching. Thus, the second instrument that was used was a short questionnaire administered to students at the end of the project. It consisted of ten 5-point Likert-type questions (worded "Strongly Agree", "Agree", "Neutral", "Disagree" and "Strongly Disagree") and four open-ended questions. Scores were obtained by allocating numerical values to responses: "Strongly Agree" scored 5, "Agree" scored 4; "Neutral" scored 3; "Disagree" scored 2 and "Strongly Disagree" scored 1. The open-ended questions asked students to justify their views and opinions.
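The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in the study: the responses are hypothetical, the reversal rule for negatively worded items (6 minus the score) is an assumed common convention, and the use of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev

# Numerical values allocated to the response labels, as described in the text.
LIKERT_SCORES = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}


def score_response(label, reversed_item=False):
    """Convert a Likert label to its score.

    For reverse-scored items the scale is flipped (5 -> 1, 4 -> 2, ...).
    The 6 - score rule is an assumption, not taken from the study.
    """
    score = LIKERT_SCORES[label]
    return 6 - score if reversed_item else score


# Hypothetical responses of five students to one negatively worded item.
responses = [
    "Strongly Disagree",
    "Disagree",
    "Strongly Disagree",
    "Neutral",
    "Disagree",
]
scores = [score_response(r, reversed_item=True) for r in responses]
item_mean = mean(scores)
item_sd = stdev(scores)  # sample standard deviation (assumed)
print(scores, round(item_mean, 2), round(item_sd, 2))
```

With reverse scoring, disagreement with a negatively worded statement counts as a positive attitude, so all items can be read on the same scale.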

Results
Students who were absent from one or more sessions were excluded from the analysis. Thus, the final sample size was sixty students, divided into three groups of twenty: (a) Group 1, students taught with the conventional method, (b) Group 2, students taught with the constructivist method, and (c) Group 3, students who used the games. To obtain quantitative data, the evaluation sheets (including the pre- and the delayed post-test) were graded on the basis of the number of correct answers. Mean scores and standard deviations per group of participants and per evaluation sheet are presented in Table 1. One-way ANOVA tests were to be conducted to compare the scores of the three groups in all tests, in order to determine whether they differed significantly. Prior to conducting these tests, it was checked whether the assumptions of ANOVA testing were met. It was found that: (a) all groups had the same number of participants (N = 20), (b) there were no outliers, (c) with the exception of the pre-test, the data were not normally distributed in all tests, as assessed by Q-Q plots and the Shapiro-Wilk test, and (d) the homogeneity of variance was violated in most cases, as assessed by Levene's Test of Homogeneity of Variance. For the pre-test, where all the assumptions were met, ANOVA testing was conducted. For the other evaluation sheets, the Kruskal-Wallis H test, which is non-parametric, was used. Even though this test does not require normally distributed data, it assumes that the shapes of the distributions are similar in all groups (Corder & Foreman, 2009; Siegel & Castellan, 1988), as was the case in the present study. The results of these tests are presented in Table 2.
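For readers unfamiliar with the two tests named above, the following Python sketch shows how a one-way ANOVA and a Kruskal-Wallis H test can be computed for three groups. It is a minimal illustration under stated assumptions, not the analysis carried out in the study: all scores are hypothetical, the function names are invented for this example, and the closed-form p-values hold only because three groups give 2 degrees of freedom.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(g1, g2, g3):
    """Return (H, p) with tie correction, using the chi-square approximation.

    For 2 degrees of freedom the chi-square survival function is exp(-H / 2).
    """
    groups = [g1, g2, g3]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction
    return h, math.exp(-h / 2)


def one_way_anova_three_groups(g1, g2, g3):
    """Return (F, p); assumes the scores are not all identical within groups.

    For F(2, m) the survival function is (1 + 2F / m) ** (-m / 2).
    """
    groups = [g1, g2, g3]
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_within = n - 3
    f = (ss_between / 2) / (ss_within / df_within)
    return f, (1 + 2 * f / df_within) ** (-df_within / 2)


# Hypothetical scores for three small groups (not the study's data).
group_1 = [18, 17, 19, 16, 18]
group_2 = [14, 15, 13, 16, 14]
group_3 = [17, 18, 18, 19, 16]
print(one_way_anova_three_groups(group_1, group_2, group_3))
print(kruskal_wallis_three_groups(group_1, group_2, group_3))
```

The Kruskal-Wallis test works on ranks rather than raw scores, which is why it remains usable when normality or homogeneity of variance is violated.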
Taken together, the above results suggested that:
• All groups had the same initial knowledge level, given that in the pre-test there was no statistically significant difference between them. Therefore, any differences noted in the evaluation sheets can be attributed to the different teaching methods.
• In ES1 (Pollution) and in the delayed post-test, there were no statistically significant differences between the three groups. Therefore, in these cases, the different teaching methods had no effect on the learning outcomes of students.
• In ES2 (Meet recycling bins), in ES3 (What about electronics?), and in ES4 (Do you love our planet?), while groups 1 and 3 did not have statistically significant differences, they both outperformed Group 2.
• In ES5 (Air Pollution), Group 1 outperformed Group 3, but not Group 2. Also, groups 2 and 3 did not have different learning outcomes.
• In total, Group 3 (computer games) outperformed Group 2 (constructivist teaching) in three out of six cases (including the delayed post-test), while these groups had the same results in the other three cases. Also, Group 3 did not have a statistically significant difference with Group 1 (conventional teaching) in five out of six cases.
On the basis of the above, H1 (teaching EFL with the use of computer games produces learning outcomes that are similar to those of conventional teaching methods) was confirmed.
Open Journal for Educational Research, 2017, 1(1), 31-44.
Coming to the questionnaire which was administered to students who used the computer games, their positive attitude towards them is evidenced in most of their responses (Table 3). Students liked the games (in general) a lot (M = 4.45, SD = 0.83) and expressed their desire to use games in other courses as well (M = 4.60, SD = 0.99). Students also pointed out that they faced no problems in controlling/using the games (M = 4.50, SD = 0.95) and that they helped them to learn English (M = 4.20, SD = 1.06). Their opinions regarding the various game features were also very positive (e.g., music M = 4.00, SD = 1.17; characters and graphics M = 4.05, SD = 1.00). It has to be noted that students also liked that they worked in pairs (M = 4.60, SD = 0.75).
Students' positive attitude towards the games was also evident in their responses to the open-ended questions. Some indicative responses were: "I liked all the units a lot and I also enjoyed working with [name/fellow student]". "I will never forget these lessons!" "It was nice because we were doing our lesson and at the same time we were playing".
On the basis of the above, H2 (students form positive attitudes regarding the use of computer games in the teaching of EFL) was also confirmed.

Table 3
Item                                                              M      SD
[item text missing]                                               4.05   1.00
The characters' animation was nice.                               3.70   1.26
I did not like working with my fellow student.*                   4.60   0.75
It was nice that we were playing while studying.                  4.35   1.09
I think that I did not learn anything by playing these games.*    4.20   1.06
Using/controlling the games was easy.                             4.50   0.95
Learning through games was easy.                                  4.10   1.37
I would like to use games in other courses too.                   4.60   0.99
Note. * = item whose scoring was reversed

Discussion
Computer games for teaching EFL are rarely used in mainstream primary school settings. The present study contributes to the knowledge base of this still inadequately documented yet important area, by designing and implementing a project which had 10-11-year-old students as its target group. During the design of the project, there were reservations regarding the learning outcomes of the three teaching methods, given that most students study English in private evening schools. The data in Table 1 demonstrate, quite clearly, that in all the evaluation sheets (including the pre- and the delayed post-test), students were able to achieve quite high scores, which probably indicates a fairly good knowledge of English. This, in turn, probably had an impact on the study's results because it did not allow the differences between groups to be very strong. Indeed, the data analysis revealed that the statistically significant differences between groups varied and that no teaching method was clearly better than the rest. This finding is in agreement with previous studies that reported mixed results (e.g., Perrotta et al., 2013).
On the other hand, it can be argued, with relative certainty, that computer games produced equally good (and in some cases better) learning outcomes compared to the other two methods. On the basis of this result, it can be argued that they can serve as an effective medium for teaching EFL at primary school level. This argument is backed by students' responses to the questionnaire, given that their views were particularly positive regarding all of the project's aspects. A series of reasons that have to do with the games and the teaching method that was followed may have led to this result.
Students stated that collaboration with their fellow students had a positive impact on their learning and that cooperation with their peers was smooth. Collaboration between peers was the theoretical basis on which the whole project was based. The fact that it worked well, probably led to the active participation of students in the learning process, experimentation and in the common effort to achieve the best possible result (Tolmie et al., 2010). The fact that digital games offer a fertile ground for the exchange of information and ideas, development of cooperative activities, and that they encourage social learning, has been noted in the past (e.g., Mitchell & Savill-Smith, 2004;Sauvé, Renaud & Kaufman, 2010).
It should be recalled that in the games group the teacher's role was minimal; students were highly autonomous and free to follow their own learning pace. Increased learning autonomy when playing educational games is also a factor which operates in parallel with students' cooperation (Fokides, 2017; Prensky, 2001a). In fact, the longer students have control over their own learning process, the better the results (McLoughlin & Marshall, 2000). So, the fact that there were good learning outcomes in this project seems to confirm the views of researchers who believe that students with a high degree of autonomy and increased control during the learning process can achieve positive learning results (Hong, McGee & Howard, 2000; Mayer & Moreno, 2003; Nunes, Bryant & Watson, 2009).
The introduction of computer games in the classroom did not disturb its smooth functioning; instead, it created a pleasant and fun atmosphere, although students had not previously worked in a similar way. Fun is the dominant element of educational computer games (Mawer & Stanley, 2011). This was also verified in this work, based on the responses of students to the relevant questions. In turn, the pleasant climate that was formed may have led to increased motivation for learning (Connolly et al., 2012; Malone, 1981; Malone & Lepper, 1987).
The fun and enjoyment when using the games and the interest of students were probably intensified by the fact that there was a scoring system and added bonuses as "rewards" for their correct answers. Thus, they had immediate feedback on the results of their actions, and, in a way, this encouraged them to try harder. They could also re-play the games if they wished to achieve higher scores. The element of control over the learning process, through ongoing feedback, that computer games allow, has been noted by others (e.g., Larsen et al., 2012).
Students' responses to almost all of the questions regarding the games' features (e.g., music, characters, and graphics) were highly positive. Their replies demonstrated their clear preference for a different kind of teaching environment (in contrast to conventional textbooks) and this is a strong indication of how welcome this alternative way of teaching is, as other researchers have also noted (Anyaegbu, Ting & Li, 2012; Prensky, 2007; Wrzesien, Pérez López & Alcañiz Raya, 2010). Students were able to familiarize themselves with the use of the games quite quickly and they did not experience any problems, confirming their characterization as "digital natives", which indicates their strong relationship with technology (Prensky, 2001b; Whitton, 2007).
The time needed for the development of the games was about 150 hours. Although the software used is not considered particularly difficult to learn, the development of educational games by non-specialists proved to be a time-consuming process. One might argue that such an effort is not justified if the final outcomes are taken into consideration (Kluge & Riley, 2008). It can also be argued that because the effort was "amateurish", the inadequacies of the games that were used (both in terms of their implementation and content) might have had a negative impact on the learning outcomes. On the other hand, there are no educational computer games that have been certified for their educational value and the integrity of their content, at least in Greece. Therefore, there is a need for collaboration between educators and computer experts for the production of such games. At the same time, if we want teachers to become able producers of their own educational games, we need software tools that make the whole process much more agile and intelligent, while reducing the production time (Scacchi, 2012).

Conclusion
The key finding of the study was that students in the games group achieved the same learning outcomes as the other two groups. Increased interest and motivation were also noted. Although the results are interesting, the study has limitations that have to be taken into account. The sample, although sufficient for statistical analysis, was fairly limited both numerically and geographically. It is therefore quite difficult to generalize the results. Also, due to the short duration of the project, the units that were taught were also limited. The teaching of more units would have enabled a more thorough examination of the research questions. Finally, students may not have been completely honest in their responses to the questionnaire, confusing it with some form of evaluation.
It should be noted that the games were largely "amateurish"; they were not the outcome of professionals' work. Thus, future research could cover more teaching units, involve larger sample sizes and wider age groups, and use games that are the result of cooperation between educators and ICT specialists. In addition, future studies may utilize more research tools, such as interviews and observations, that would allow the collection of more detailed research data.
In conclusion, the need to change the way we teach English in mainstream primary school settings and the utilization of innovative teaching methods is almost self-evident. Digital games offer an interesting alternative. However, more research is needed in order to establish their exact impact in teaching.