
Can AI Agents Measure Leadership? Harvard Study Suggests a Powerful New Future
TL;DR:
A Harvard Kennedy School study suggests that AI agents can stand in for human teammates well enough to measure leadership performance. Leaders' results with AI teams closely tracked their results with human teams, effective leaders behaved in consistent ways across both settings, and the approach points toward a more scalable, equitable, and objective way to identify leadership potential based on action, not assumption.
Leadership isn’t just about charisma or credentials. It’s about impact.
But how do we measure leadership objectively, at scale, and free from bias?
A new study from Harvard Kennedy School offers a compelling answer: simulate leadership with artificial intelligence.
In a pioneering experiment, researchers used Large Language Models (LLMs) to act as human-like followers in team decision-making tasks. The goal? To test whether the way someone leads AI agents reflects how they'd lead real people.
The results? Groundbreaking.
The Study at a Glance
📄 Study Title: “Measuring Human Leadership Skills with Artificially Intelligent Agents”
👨🔬 Authors: Ben Weidmann, Yixian Xu, David J. Deming (Harvard Kennedy School)
📅 Published: August 2025
🔗 Full PDF on arXiv.org
What They Did
Researchers designed a collaborative decision-making test using a common leadership challenge known as the “Hidden Profile” task. Participants played the role of a team leader. Each team member, whether human or AI, held unique information. The leader’s job was to gather insights, ask smart questions, and guide the team toward an accurate final decision.
Leaders completed the task twice:
Once with human teammates
Once with AI agent teammates
By comparing outcomes across both scenarios, researchers were able to isolate a “leader effect”: the causal impact a leader had on team performance.
What They Found
Leadership Identity Mattered - A Lot
Even after adjusting for hard skills (like cognitive ability or typing speed), the leader's identity alone explained more than 50% of the variance in team success, in both AI and human teams.
Bottom line: Some people consistently lead better, regardless of who they lead.
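To make "share of variance explained by leader identity" concrete, here is a minimal Python sketch of a simple one-way variance decomposition (sometimes called eta squared). It is an illustration only: the scores are invented, the function name is hypothetical, and this article does not describe the study's actual estimation method, which also adjusts for hard skills.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical team scores: (leader_id, score) for each team a leader ran.
# These numbers are invented for illustration and are not from the study.
team_scores = [
    ("A", 0.90), ("A", 0.80), ("A", 0.85),
    ("B", 0.50), ("B", 0.60), ("B", 0.55),
    ("C", 0.70), ("C", 0.75), ("C", 0.65),
]


def leader_variance_share(rows):
    """Return the fraction of total score variance that lies between leaders."""
    by_leader = defaultdict(list)
    for leader, score in rows:
        by_leader[leader].append(score)

    grand_mean = mean(score for _, score in rows)
    # Total spread of all team scores around the overall average.
    ss_total = sum((score - grand_mean) ** 2 for _, score in rows)
    # Spread of each leader's average around the overall average,
    # weighted by how many teams that leader ran.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in by_leader.values()
    )
    return ss_between / ss_total


share = leader_variance_share(team_scores)
print(f"Share of variance explained by leader identity: {share:.0%}")
```

A share near 1 means team results depend almost entirely on who led; a share near 0 means the leader made little difference. With only a few teams per leader, this simple ratio tends to overstate the leader's share.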
AI Leader Scores Matched Human Results
The study found a strong correlation between a leader’s AI-team performance and human-team performance:
Raw score correlation: ρ = 0.81
Leadership-only correlation (adjusted for hard skills): ρ = 0.69
This suggests that how you lead AI agents strongly predicts how you lead people.
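For readers who want to see what a correlation like this measures, here is a small Python sketch. It assumes ρ is a Pearson correlation (the article does not say which coefficient the study reports), and the leader scores are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical per-leader scores (not the study's data):
# one score from leading AI teammates, one from leading human teammates.
ai_team_scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74]
human_team_scores = [0.60, 0.69, 0.58, 0.77, 0.70, 0.72]

rho = pearson(ai_team_scores, human_team_scores)
print(f"Correlation between AI-team and human-team scores: {rho:.2f}")
```

A value close to 1 means leaders who score well with AI teammates also tend to score well with human teammates. The "leadership-only" figure would apply the same calculation after removing the part of each score explained by hard skills.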
Effective Leaders Behaved the Same with Humans and AI
Top-performing leaders in both settings:
Asked more open-ended questions
Used inclusive language (“we,” “us”)
Encouraged turn-taking and participation
Displayed strategic patience and emotional regulation
Interestingly, positive affect (friendly tone) had a greater impact on human teams than AI ones.
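Two of the behaviors above, open-ended questions and inclusive language, can be roughly counted in a chat transcript. The Python sketch below is a simple heuristic with hypothetical word lists and function names; the article does not describe how the researchers actually coded leader messages.

```python
import re

# Assumed word lists for illustration; not the study's coding scheme.
INCLUSIVE_WORDS = {"we", "us", "our", "let's"}
OPEN_QUESTION_STARTERS = ("what", "how", "why")


def leader_message_features(messages):
    """Count inclusive words and open-ended questions in a leader's messages."""
    inclusive = 0
    open_questions = 0
    for message in messages:
        # Normalize case and curly apostrophes so "Let’s" matches "let's".
        text = message.lower().replace("’", "'")
        words = re.findall(r"[a-z']+", text)
        inclusive += sum(1 for word in words if word in INCLUSIVE_WORDS)
        # Split into sentences at whitespace that follows ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            sentence = sentence.strip()
            if sentence.endswith("?") and sentence.startswith(OPEN_QUESTION_STARTERS):
                open_questions += 1
    return {"inclusive_words": inclusive, "open_questions": open_questions}


sample_messages = [
    "What does everyone know about candidate B?",
    "Let's compare notes before we decide.",
    "Is that correct?",
]
features = leader_message_features(sample_messages)
print(features)
```

Here a question counts as open-ended only if it starts with "what," "how," or "why," so a yes/no question like "Is that correct?" is not counted. A real analysis would need a more careful definition.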
Why This Matters for Emerging Leaders
If you’re an up-and-coming manager or leadership coach, this research opens new doors:
1. Objective, Scalable Assessments
AI-based leadership tests cost ~$23 per participant (vs. $114 for human-run tests) and don’t require coordination, scheduling, or facilitators.
2. More Equity in Leadership Selection
Demographic factors like gender, age, ethnicity, and education had no predictive value in this study. It was only behavior and performance that mattered.
3. Faster Leadership Research
With AI simulating human teammates, leadership experiments can be run at scale and speed, enabling faster innovation in how we train and evaluate leaders.
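To put the per-participant costs quoted above (~$23 vs. $114) in perspective, this short Python sketch scales them to a few cohort sizes. The cohort sizes and function names are assumptions for illustration, and the per-participant figures are treated as fixed, which may not hold in practice.

```python
# Per-participant costs as quoted in this article (USD).
AI_COST_PER_PARTICIPANT = 23
HUMAN_COST_PER_PARTICIPANT = 114


def assessment_cost(participants, cost_per_participant):
    """Total cost of assessing a cohort at a flat per-participant rate."""
    return participants * cost_per_participant


def savings_fraction(ai_cost, human_cost):
    """Fraction of the human-run cost saved by using the AI-based test."""
    return 1 - ai_cost / human_cost


savings = savings_fraction(AI_COST_PER_PARTICIPANT, HUMAN_COST_PER_PARTICIPANT)
print(f"Savings per participant: {savings:.0%}")

# Hypothetical cohort sizes.
for cohort in (50, 200, 1000):
    ai_total = assessment_cost(cohort, AI_COST_PER_PARTICIPANT)
    human_total = assessment_cost(cohort, HUMAN_COST_PER_PARTICIPANT)
    print(f"{cohort} participants: ${ai_total:,} (AI) vs. ${human_total:,} (human-run)")
```

At these rates, the AI-based test costs roughly a fifth as much per person as the human-run version.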
Take Action: How to Lead Like the Top Performers
Whether you’re coaching others or leading your own team, try this:
Ask thoughtful, open-ended questions
Use language that promotes collaboration (“we,” “our team,” “let’s”)
Facilitate balanced dialogue; don’t dominate the conversation
Focus on clarity, not just charisma
And consider how AI tools might be used to measure, develop, or simulate your leadership style with objectivity and insight.
FAQ
Q1: How accurate is the AI Leadership Test compared to real-world scenarios?
The AI Leadership Test results closely mirrored human outcomes, with strong correlation in both raw scores and soft-skill contributions. While more real-world validation is needed, this marks a promising shift toward scalable leadership testing.
Q2: What behaviors predicted leadership success?
Leaders who asked more questions, used inclusive language, encouraged participation, and balanced assertiveness with emotional control consistently outperformed others in both AI and human settings.
Q3: Could this test replace human interview panels?
Not yet. However, it could augment or complement traditional methods. It offers a scalable, fairer way to screen for leadership skills, especially when time, resources, or bias are concerns.
Source & Attribution
This article is based on the research study:
“Measuring Human Leadership Skills with Artificially Intelligent Agents”
Authors: Ben Weidmann, Yixian Xu, David J. Deming
Institution: Harvard Kennedy School
Published: August 2025
📄 Read the full study on arXiv
This research was funded by the Walmart Foundation and conducted using transparent, pre-registered methods with open access to code and data.
This article was brought to you by Avery, Day Development’s AI-powered leadership companion. We’re embracing the future of technology to deliver bold, relevant insights that provide meaningful, actionable information for today’s leaders.