
Your Boss Will Soon Measure You in Tokens

Meta grades employees on AI-driven impact. Microsoft says AI is no longer optional. Google tracks productivity hours from AI tools. But a rigorous study found AI makes experienced developers 19% slower. Here is what token-based performance metrics mean for your career.

24 February 2026 · 9 min read

Last Updated: February 24, 2026

Meta now grades employees on "AI-driven impact." Microsoft told staff that using AI is "no longer optional." Google tracks weekly productivity hours generated by AI tools. The message from Big Tech is clear: your relationship with AI is becoming part of your performance review.

But here is the uncomfortable question nobody in the C-suite wants to answer: what if measuring AI usage actually makes people worse at their jobs?

What Does "Measured in Tokens" Actually Mean?

Tokens are the basic units that AI systems process. Every prompt you type, every response you receive, every line of code an AI assistant generates gets broken into tokens. One token is roughly three-quarters of a word. When companies track your AI usage, they are essentially counting how many tokens flow through your work.
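The "one token is roughly three-quarters of a word" rule of thumb can be sketched in a few lines of Python. This is a rough heuristic, not a real tokenizer: production systems use subword schemes like byte-pair encoding (OpenAI's `tiktoken` library, for example), so actual counts vary with vocabulary, punctuation, and language.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~0.75 words-per-token rule of thumb.

    Real tokenizers split on subword units, so treat this as a
    back-of-the-envelope figure only.
    """
    words = len(text.split())
    return round(words / 0.75)  # ~4/3 tokens per word

# Example: a 9-word prompt comes out to roughly 12 tokens.
prompt = "Summarize the Q3 revenue report and flag any anomalies."
print(estimate_tokens(prompt))
```

Multiply that by every prompt, response, and generated code block in a workday, and you get the raw number a usage dashboard would report.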

Meta became the first major tech company to formalize this in early 2026, making AI-driven impact a core part of employee evaluations across all roles. Engineering managers now evaluate workers partly on their ability to use AI systems to accelerate development cycles. Microsoft followed with an internal memo stating that AI usage is "no longer optional," with GitHub CEO Thomas Dohmke calling it "fair game" to evaluate employees on whether they use Copilot. Google has been tracking increases in productivity hours created per week from engineers using AI tools, with CEO Sundar Pichai reporting a 10% boost so far.

This is not a pilot program at one adventurous startup. This is the three largest tech companies on Earth, collectively employing over 400,000 people, all converging on the same idea at the same time.

Which Companies Are Already Tracking AI Usage?

The tracking goes well beyond the tech giants themselves. A growing ecosystem of workforce analytics tools now lets any company monitor how employees interact with AI.

Worklytics tracks how much time teams spend engaging with corporate AI tools, providing aggregate-level analytics for leadership. ActivTrak and Hubstaff go further, enabling monitoring at both team and individual levels. Both can identify which company-approved AI tools workers are using, and both can flag the use of unapproved programs.

All three providers told Business Insider they have seen a sharp rise in demand for AI usage monitoring over the past two years. The market is clear: employers want visibility into who is adopting AI, who is resisting, and who is using tools they should not be.

According to BCG's "AI at Work 2025" report, more than 75% of leaders and managers use generative AI several times per week. But regular use among frontline employees has stalled at 51%. That gap is exactly what the monitoring tools aim to close.

McKinsey's 2025 workplace AI report found that 78% of organizations now use AI in at least one business function, up from 55% just two years earlier. The adoption curve is not slowing down. The measurement curve is catching up.

Does More AI Usage Actually Mean Better Performance?

Here is where the entire narrative gets complicated.

In July 2025, the nonprofit research organization METR (Model Evaluation and Threat Research) published a study that sent shockwaves through the tech industry. They paid experienced open-source developers $150 per hour to complete real coding tasks, with and without AI tools. The developers predicted AI would make them 20% faster.

The actual result: developers using AI took 19% longer to complete their tasks.

That is not a rounding error. That is a complete inversion of expectations. The study found that 9% of total task time in AI-assisted work was consumed by reviewing and fixing AI-generated code. The cognitive overhead of context-switching between writing code and managing an AI assistant ate into the productivity gains.

MIT Technology Review covered the finding in December 2025, noting it as "the most provocative" result in AI coding research. Ars Technica reported that a majority of developers in the study needed to make changes to AI-generated code before it could be used.

So if Meta is grading engineers on AI-driven impact, and the most rigorous study available shows AI can slow experienced developers down, what exactly is being measured? Usage or outcomes?

What Are the Risks of Token-Based Performance Metrics?

The risks split into three categories, and none of them are hypothetical.

Goodhart's Law on steroids. The moment a metric becomes a target, it stops being a useful metric. If employees know they are being evaluated on AI usage, the rational response is to maximize visible AI interactions regardless of whether those interactions improve work output. Prompt an AI assistant to rewrite an email you already wrote perfectly well. Ask Copilot to generate boilerplate code you could type faster yourself. The tokens flow, the dashboard lights up green, and actual productivity stays flat or declines.

The surveillance creep problem. Employers have tracked computer activity for years. But AI usage tracking introduces something qualitatively different: it measures how you think, not just what you produce. Monitoring which AI tools an employee uses, how often, and for what purposes creates an intimate map of cognitive work patterns. ActivTrak can flag "unapproved" AI tool usage, which means it can detect employees experimenting with tools their company has not sanctioned, even if those tools produce better results.

The experience penalty. The METR study specifically tested experienced developers. These are people who already know how to solve problems efficiently. Forcing them through an AI intermediary added friction. Junior developers might benefit more from AI assistance, but a blanket "use more AI" policy penalizes the very people whose expertise your organization depends on.

How Should Companies Actually Measure AI's Impact?

The companies getting this right are measuring outcomes, not inputs. The distinction matters enormously.

Outcome-based metrics that work:

  • Revenue generated per employee (tracked over time as AI tools are adopted)
  • Time-to-completion for standardized tasks (before and after AI deployment)
  • Error rates in deliverables (does AI-assisted work contain fewer mistakes?)
  • Customer satisfaction scores for AI-augmented service interactions
  • Time reclaimed and reallocated to higher-value work

Input-based metrics that backfire:

  • Number of AI tool sessions per week
  • Tokens consumed per employee
  • Percentage of code written by AI assistants
  • Frequency of AI tool logins
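The difference between the two columns is easy to make concrete. An outcome metric compares task time before and after AI adoption; an input metric just counts sessions. A minimal sketch with made-up numbers (the task names and hours below are illustrative, not from any cited study):

```python
# Illustrative data only: task names and hours are invented for this sketch.
baseline_hours = {"quarterly_report": 8.0, "api_integration": 20.0}
ai_assisted_hours = {"quarterly_report": 5.0, "api_integration": 23.0}

def speedup_pct(before: float, after: float) -> float:
    """Percent change in time-to-completion.

    Positive means faster with AI; negative means slower, as METR
    observed for experienced developers.
    """
    return (before - after) / before * 100

for task in baseline_hours:
    delta = speedup_pct(baseline_hours[task], ai_assisted_hours[task])
    print(f"{task}: {delta:+.0f}%")
```

A token counter would score both tasks as heavy AI usage. The outcome metric shows one task got faster and the other got slower, which is exactly the signal a usage dashboard hides.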

Google's approach of tracking "hours of productivity created" sits closer to the outcome side, though it still requires honest attribution. Meta's "AI-driven impact" framing could go either way depending on how managers interpret it.

The BCG report found that only 25% of frontline employees say they receive strong leadership support for AI adoption. That suggests the problem is not resistance. It is that companies are measuring usage before they have invested in training.

What Does This Mean for Your Career in 2026?

The trend is irreversible. Whether or not token-counting makes sense as a productivity metric, the corporate world has decided that AI adoption is a performance criterion. Adapting means three things.

First, learn to use AI tools well, not just often. Quality of AI interaction matters more than quantity. One well-crafted prompt that saves four hours of research is worth more than fifty quick prompts that generate text you rewrite anyway. The METR study's key insight was that experienced developers struggled not because AI is bad, but because they had not optimized their workflows around it.

Second, document your AI-driven outcomes. If your employer starts measuring AI usage, make sure your actual results are visible too. Saved 20 hours on a report? Built a prototype in two days instead of two weeks? Those outcomes need to be on the record, not just the token count.

Third, push back on pure usage metrics. If your organization introduces AI usage tracking without outcome measurement, advocate for both. The data supports you. A blanket "use more AI" mandate without measuring whether it actually helps is not a strategy. It is a cargo cult.

The companies that will win this transition are the ones measuring what AI enables, not just whether people are clicking the button. The employees who will thrive are the ones who can demonstrate that their AI usage creates real value, not just dashboard activity.

Your boss may soon measure you in tokens. Make sure the tokens count.


AJ Awan is an AI consultant and founder of Flowtivity, helping businesses implement AI that actually improves operations rather than just checking a compliance box. With 9+ years of consulting experience including 6 years at EY, he has delivered over $15M in business benefits across enterprise engagements.
