Factlen Explainer · EdTech · Jun 19, 2026, 1:08 AM · 6 min read

How AI Tutors Are Finally Cracking the 'Two Sigma' Problem in Online Learning

Generative AI is democratizing personalized education, bringing the large learning gains of one-on-one tutoring to millions of students for roughly the cost of computation.

By Factlen Editorial Team

EdTech Optimists 40% · Pedagogical Realists 35% · Classroom Educators 25%
EdTech Optimists
Believe AI will fully democratize one-on-one tutoring and solve the Two Sigma problem.
Pedagogical Realists
Argue human connection is required and Bloom's original two-sigma claim is often exaggerated.
Classroom Educators
Focus on AI as a co-pilot to reduce administrative burden and assist with active learning.

What's not represented

  • Students from regions without reliable internet access
  • Data privacy advocates

Why this matters

Access to personalized tutoring has historically been a privilege of the wealthy. By scaling adaptive, one-on-one instruction through AI, education systems can close learning gaps and dramatically improve outcomes for students regardless of their socioeconomic background.

Key points

  • Benjamin Bloom's 1984 research showed 1-to-1 tutoring improves student performance by two standard deviations.
  • Generative AI is now allowing education systems to scale personalized tutoring at a fraction of the cost.
  • A 2025 randomized controlled trial found AI tutors outperformed traditional active learning environments.
  • AI tutors act as co-pilots for teachers, providing real-time analytics on student comprehension.
  • 0.73–1.3 SD: learning gain from an AI tutor versus in-class active learning
  • 49 mins: median time on task for AI-tutored students
  • ~20%: greater-than-expected learning gains at 30+ minutes per week (Khan Academy efficacy study)
  • 6.1%: improvement in next-item correctness after a 2026 AI update

For decades, educators have chased a holy grail: the ability to provide every student with a personalized, one-on-one tutor. The pursuit stems from a landmark 1984 paper by educational psychologist Benjamin Bloom, who identified what he called the 'Two Sigma Problem.' Bloom's research demonstrated that students who received one-on-one tutoring using mastery learning techniques performed two standard deviations—or two 'sigmas'—better than students in conventional classrooms. In practical terms, this meant the average tutored student outperformed 98 percent of their conventionally taught peers, effectively turning a 'C' student into an 'A' student.[4][6]
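The 98 percent figure follows from the normal distribution. As a minimal sketch, assuming test scores in both groups are roughly normal with equal spread (an assumption made here for illustration), the share of a comparison group scoring below a student who sits d standard deviations above its mean is the normal cumulative distribution evaluated at d. The function name is hypothetical.

```python
import math

def percentile_from_effect_size(d: float) -> float:
    """Share of a normal comparison group scoring below a student at +d SD."""
    # Standard normal CDF written with the error function from the stdlib.
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

two_sigma = percentile_from_effect_size(2.0)
print(f"{two_sigma:.1%}")  # 97.7%, which rounds to the 98 percent cited above
```

The same function can be applied to any other effect size to translate standard deviations into a percentile rank.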

The 'problem' in Bloom's Two Sigma Problem was never the efficacy of the method, but its scalability. Providing a dedicated human tutor for every single student is prohibitively expensive and logistically impossible for public education systems globally. For forty years, the challenge has been to find group instruction methods or scalable technologies that could replicate the massive learning gains of personalized tutoring without the astronomical costs. Until recently, most digital learning tools—from early educational software to massive open online courses—fell drastically short of this benchmark, offering static content rather than adaptive guidance.[6]

The landscape shifted dramatically with the advent of advanced generative artificial intelligence. Unlike previous generations of educational software that relied on pre-programmed decision trees, modern conversational AI tutors can engage learners in natural language dialogue. They do not simply dispense answers; instead, they employ the Socratic method, breaking down complex problems into component parts and gently guiding students to discover the solutions themselves. This shift from passive software to interactive, conversational agents has reignited hopes that the Two Sigma barrier might finally be breached.[6]

Bloom's 1984 research showed that 1-to-1 tutoring could shift student performance by two standard deviations.

Recent empirical evidence suggests these AI systems can deliver substantial gains. In a peer-reviewed randomized controlled trial published in Scientific Reports in June 2025, researchers tested a carefully designed AI tutor against highly rated in-class active learning in an undergraduate physics course. Students learned more with the AI tutor than in the active learning classroom, with an effect size between 0.73 and 1.3 standard deviations. The trial is among the stronger experimental results to date for AI tutoring, though it covers a single course.[2]
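An effect size expressed in standard deviations is the gap between two group means divided by a measure of the spread of scores. The sketch below uses one common definition, Cohen's d with a pooled standard deviation. The trial's own estimator may differ, and the scores are invented purely to show the arithmetic.

```python
import math
import statistics

def cohens_d(treatment, control):
    """Difference in group means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1 = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    v2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Hypothetical post-test scores, not data from the study.
ai_tutor_scores = [4, 5, 3, 5, 4]
classroom_scores = [3, 4, 2, 4, 3]
effect_size = cohens_d(ai_tutor_scores, classroom_scores)
```

Because the result is unit-free, effect sizes from different tests and studies can be compared on the same scale.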

Beyond raw test scores, the 2025 study highlighted a significant gain in efficiency. Students using the AI tutor achieved their superior post-test scores in less time, with a median time on task of 49 minutes compared to 60 minutes for the in-class learners. By managing cognitive load and providing immediate, tailored feedback, the AI system allowed students to bypass the friction of waiting for a teacher's attention or struggling silently with a misconception.[2]

A 2025 randomized controlled trial found that AI tutoring outperformed traditional active learning in both efficacy and speed.

Many of these tutors rely on a technique known as Retrieval-Augmented Generation (RAG), which anchors the AI's responses to a curated curriculum rather than the open internet. When a student submits an answer, the system identifies the specific mathematical or logical error, retrieves a relevant teaching strategy, and generates a tailored prompt. If a student struggles with a quadratic equation, the AI does not solve it; it asks the student to identify the coefficients, scaffolding the problem step by step as a human expert would.[6]
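The retrieve-then-prompt flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor's implementation: the tiny curriculum, the keyword-overlap retrieval (production systems typically use embedding search), and every name below are hypothetical. The final generation step, where the prompt is sent to a language model, is left out.

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    topic: str
    keywords: frozenset
    hint: str

# Hypothetical curated curriculum; a real one would hold far more entries.
CURRICULUM = [
    Strategy("quadratic equations",
             frozenset({"quadratic", "x^2", "coefficients", "roots"}),
             "Ask the student to identify the coefficients a, b and c."),
    Strategy("fractions",
             frozenset({"fraction", "numerator", "denominator"}),
             "Ask the student to find a common denominator first."),
]

def retrieve(student_text: str) -> Strategy:
    """Return the curriculum entry sharing the most keywords with the text."""
    words = set(student_text.lower().split())
    return max(CURRICULUM, key=lambda s: len(s.keywords & words))

def build_prompt(problem: str, student_answer: str) -> str:
    """Ground the model's instructions in the retrieved teaching strategy."""
    strategy = retrieve(f"{problem} {student_answer}")
    return (
        "You are a tutor. Do not reveal the final answer.\n"
        f"Topic: {strategy.topic}\n"
        f"Teaching strategy: {strategy.hint}\n"
        f"Problem: {problem}\n"
        f"Student answer: {student_answer}\n"
        "Reply with one guiding question."
    )

prompt = build_prompt("Solve the quadratic x^2 - 5x + 6 = 0", "x = 5")
```

The instruction not to reveal the answer lives in the prompt, so the retrieved strategy shapes what the model asks rather than what it solves.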

In the K-12 sector, the most prominent real-world application of this technology is Khanmigo, an AI-powered tutor developed by the nonprofit Khan Academy. Built in partnership with OpenAI, Khanmigo has been piloted in hundreds of school districts across the United States. According to a large efficacy study involving approximately 350,000 students during the 2022-2023 school year, students who used Khan Academy for 30 or more minutes per week experienced roughly 20 percent greater-than-expected learning gains on the nationally normed MAP Growth Assessment.[3]

The development of these tools is highly iterative, relying on massive datasets of student interactions to refine the AI's pedagogical instincts. In May 2026, Khan Academy released data from a six-month product testing phase aimed at reducing latency and improving accuracy. By giving the AI agent access to structured signals from a student's historical learning record—such as their recent performance patterns and specific skill gaps—developers achieved a 6.1 percent improvement in 'next-item correctness,' a metric measuring whether a student correctly answers the subsequent problem after receiving tutoring.[3]
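As described, next-item correctness is the share of tutored interactions after which the student answers the next problem correctly. The sketch below computes it from hypothetical interaction logs. The source does not say whether the 6.1 percent improvement is an absolute or a relative change, so both are shown.

```python
def next_item_correctness(events):
    """Share of tutored interactions followed by a correct next answer.

    events: list of (received_tutoring, next_item_correct) boolean pairs.
    """
    outcomes = [correct for tutored, correct in events if tutored]
    if not outcomes:
        raise ValueError("no tutored interactions to score")
    return sum(outcomes) / len(outcomes)

# Hypothetical logs from two product versions (not Khan Academy data).
old_version = [(True, True), (True, False), (True, False), (True, True), (False, True)]
new_version = [(True, True), (True, True), (True, False), (True, True), (False, False)]

old_rate = next_item_correctness(old_version)  # 2 of 4 tutored items correct
new_rate = next_item_correctness(new_version)  # 3 of 4 tutored items correct
absolute_gain = new_rate - old_rate            # difference between the two rates
relative_gain = absolute_gain / old_rate       # gain as a fraction of the old rate
```

Interactions without tutoring are excluded, so the metric reflects only what happens after the tutor steps in.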

Modern AI tutors use Retrieval-Augmented Generation to scaffold problems rather than simply giving away the answers.

Crucially, the integration of AI tutors is not rendering human teachers obsolete; rather, it is shifting their role from lecturers to instructional coaches. A December 2024 CBS News report from Hobart High School in Indiana detailed how educators are utilizing AI assistants to reclaim hours previously spent on administrative tasks. Teachers use the platform to generate differentiated lesson plans in minutes, allowing them to spend class time facilitating complex labs and providing emotional support to students who need human connection.[1]

Furthermore, AI dashboards provide teachers with real-time analytics on student comprehension. If an AI tutor detects that five students in a classroom of thirty are repeatedly stumbling over the concept of covalent bonds, it flags this cluster for the teacher. The educator can then pull those specific students aside for a targeted mini-lesson, seamlessly blending the scale of machine tutoring with the intuition and empathy of human instruction.[1]
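The flagging logic can be illustrated with a simple count. This is a sketch only: the thresholds, the function name, and the log format are assumptions, and a real dashboard would likely weigh recency and question difficulty as well.

```python
from collections import defaultdict

def flag_struggling_clusters(error_log, min_errors=2, min_students=3):
    """Map each concept to the students who repeatedly got it wrong.

    error_log: iterable of (student, concept) pairs, one per incorrect attempt.
    A concept is flagged once at least min_students have min_errors or more.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for student, concept in error_log:
        counts[concept][student] += 1
    flags = {}
    for concept, per_student in counts.items():
        struggling = sorted(s for s, n in per_student.items() if n >= min_errors)
        if len(struggling) >= min_students:
            flags[concept] = struggling
    return flags

# Hypothetical classroom log.
error_log = [
    ("ana", "covalent bonds"), ("ana", "covalent bonds"),
    ("ben", "covalent bonds"), ("ben", "covalent bonds"),
    ("cy", "covalent bonds"), ("cy", "covalent bonds"), ("cy", "covalent bonds"),
    ("dee", "covalent bonds"),
    ("ana", "ionic bonds"), ("ana", "ionic bonds"),
]
flags = flag_struggling_clusters(error_log)
# flags == {"covalent bonds": ["ana", "ben", "cy"]}
```

A single student repeating a mistake does not trigger a flag here; the point is to surface a group that a teacher could pull aside together.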

Despite the optimism, pedagogical realists caution against accepting the 'Two Sigma' narrative uncritically. Industry analysts, such as Glenda Morgan of Phil Hill and Associates, have pointed out that the tech sector often oversimplifies Bloom's original research to market new hardware and software. Modern meta-analyses of human tutoring reveal that the average effect size is closer to 0.37 standard deviations, not the mythical 2.0. While the gains from AI are impressive, expecting a literal two-sigma leap across all subjects and demographics may set unrealistic expectations.[4]

AI dashboards allow teachers to monitor real-time comprehension and intervene when specific students struggle.

There are also inherent risks in deploying generative AI in educational settings. The most pressing concern is the potential for students to over-rely on the technology, using it as a sophisticated answer key rather than a learning partner. If guardrails fail and the AI simply provides solutions, it circumvents the productive struggle that is essential for deep cognitive retention. Developers must continuously tune these systems to resist student prompts designed to extract direct answers.[5][6]

Additionally, researchers emphasize that effective tutoring is not merely an information-transfer problem; it is a deeply social and emotional endeavor. A February 2026 review published on arXiv highlighted that while conversational AI can simulate dialogue, it currently struggles to support students' metacognitive skills—the ability to reflect on their own learning processes. To reach maximum efficacy, future AI systems must better replicate the motivational qualities of human tutors who can read a student's frustration and offer encouragement.[5]

We are currently witnessing what technologists call the 'knee in the curve' of educational AI. Each interaction generates data that refines the underlying models, creating a positive feedback loop where the tutors become incrementally more effective every day. Unlike human pedagogical improvements, which spread slowly through professional development seminars, a breakthrough in an AI's ability to explain fractions is instantly deployed to millions of students worldwide.[6]

Ultimately, whether AI perfectly achieves Bloom's two-sigma benchmark may be beside the point. By lowering the cost of personalized, adaptive instruction to the mere cost of computing power, these systems are democratizing a level of educational support that was previously reserved for the wealthy. As these tools mature, they promise to fundamentally reshape the architecture of online learning, making the dream of a dedicated tutor for every student a tangible reality.[6]

How we got here

  1. 1984

    Benjamin Bloom publishes his paper on the 'Two Sigma Problem,' demonstrating the massive efficacy of 1-to-1 tutoring.

  2. 2023

    Khan Academy launches Khanmigo, one of the first generative AI tutors built on OpenAI's advanced models.

  3. Dec 2024

    Efficacy studies reveal students using Khan Academy 30+ minutes a week achieve 20% greater learning gains.

  4. Jun 2025

    A landmark randomized controlled trial in Scientific Reports finds AI tutors outperform traditional active learning.

  5. May 2026

    Khan Academy releases data showing continuous, measurable improvements in AI tutoring effectiveness at scale.

Viewpoints in depth

EdTech Optimists

Believe AI will fully democratize one-on-one tutoring and solve the Two Sigma problem.

This camp views generative AI as the most significant educational breakthrough since the printing press. They argue that by lowering the cost of personalized instruction to the cost of computation, AI tutors will eradicate socioeconomic learning gaps. Proponents point to recent randomized controlled trials showing AI outperforming traditional active learning, arguing that as the models ingest more data, their pedagogical instincts will eventually surpass those of average human tutors.

Pedagogical Realists

Argue human connection is required and Bloom's original two-sigma claim is often exaggerated.

Realists caution against the tech industry's uncritical embrace of Bloom's 1984 research, noting that modern meta-analyses show human tutoring yields much smaller, though still significant, gains. They emphasize that learning is a deeply social and emotional process. From this perspective, an AI can scaffold a math problem, but it cannot read a student's body language, offer genuine empathy, or inspire a lifelong passion for a subject the way a human mentor can.

Classroom Educators

Focus on AI as a co-pilot to reduce administrative burden and assist with active learning.

For teachers on the ground, the debate over 'two sigmas' is secondary to practical utility. Educators value AI primarily as a tool that reclaims their time. By automating lesson planning, grading, and basic remediation, AI allows teachers to step away from the whiteboard and spend more time engaging directly with students who need complex interventions or emotional support. They view AI not as a replacement, but as a highly capable teaching assistant.

What we don't know

  • Whether the massive learning gains observed in short-term AI tutoring trials will persist over multi-year educational arcs.
  • How effectively AI tutors can support students with severe learning disabilities or neurodivergent needs.
  • The long-term impact of conversational AI on students' social development and peer-to-peer collaboration skills.

Key terms

Bloom's Two Sigma Problem
The educational challenge of finding scalable group instruction methods that match the massive effectiveness (two standard deviations) of one-on-one tutoring.
Mastery Learning
An instructional approach where students must demonstrate proficiency in a topic before advancing to new, more complex material.
Retrieval-Augmented Generation (RAG)
An AI technique that grounds a model's responses in a specific, trusted database (like a curriculum) to prevent it from making up false information.
Socratic Method
A form of cooperative argumentative dialogue that stimulates critical thinking by asking and answering questions, rather than just providing facts.
Metacognition
Awareness and understanding of one's own thought processes; learning how to learn.

Frequently asked

Will AI tutors replace human teachers?

No. Evidence shows AI tutors are most effective when used as a 'co-pilot' that handles routine instruction and grading, freeing teachers to focus on complex problem-solving and emotional support.

Does the AI just give students the answers?

Well-designed AI tutors use the Socratic method. Instead of providing direct answers, they ask guiding questions to help students discover the solution themselves.

Is the 'Two Sigma' claim actually realistic?

Some analysts argue the original two-sigma claim is exaggerated, noting that human tutoring averages a 0.37 standard deviation improvement. However, recent AI trials have shown gains between 0.73 and 1.3 standard deviations.

How much does AI tutoring cost?

While human tutors can cost hundreds of dollars an hour, AI tutoring aims to lower the cost of personalized learning to the mere cost of computing power, often provided free or at low cost by school districts.

Sources

Source coverage

6 outlets

3 viewpoints surfaced

  1. [1] CBS News (Classroom Educators): How AI tutor Khanmigo is changing the classroom
  2. [2] Scientific Reports (EdTech Optimists): AI tutoring outperforms in-class active learning: an RCT introducing a novel research-based design
  3. [3] Khan Academy (EdTech Optimists): Latest Efficacy Study Results: Khan Academy and Khanmigo
  4. [4] OnlineEducation.com (Pedagogical Realists): The AI Industry's Misinterpretation of Bloom's Two Sigma Problem
  5. [5] arXiv (Pedagogical Realists): Conversational AI Tutoring: Systems, Limitations, and Future Directions
  6. [6] Factlen Editorial Team: Synthesis by Factlen editorial team