To obtain AI training data, Silicon Valley giants are turning to their own employees.

The latest AI "arena" in Silicon Valley has expanded to employees' computer desktops.

According to a May 19 report by The Information, tech giants such as Microsoft, Meta, and xAI are turning their employees' daily work activities into AI training data. The trend is spreading across the industry and shows signs of accelerating.

Microsoft believes it possesses an asset that competitors Anthropic and Cursor do not have: about 100,000 internal software engineers. Reportedly, Microsoft is collecting developers' programming data from its internal VSCode application, and is also gathering source code from games developed by Xbox game studios for AI model training. In addition, Microsoft is encouraging employees to prioritize GitHub Copilot over competing products, partly because the company tracks which Copilot-generated code engineers ultimately approve for use in real products.

The industry has its own term for this practice: "dogfooding" (having employees use the company's own products first). Google and OpenAI have similar requirements.

Meta's Approach Is More Aggressive: Tracking Mouse and Keyboard

By comparison, Meta's data collection goes deeper and is more controversial.

Wallstreetcn noted that Meta has deployed tracking software on US employees' work computers that records mouse movements, clicks, and keystrokes in real time and periodically captures screenshots. The tool is named "Model Capability Initiative" (MCI) and covers work-related applications and websites.

Why does Meta collect such data? Current AI models still have obvious shortcomings in simulating the details of human-computer interaction, such as selecting options from drop-down menus and using keyboard shortcuts, which AI struggles to replicate naturally. An internal Meta memo stated: "This is exactly where every Meta employee can help improve the model through their daily work."

Meta spokesperson Andy Stone commented: "If we're building intelligent agents to help people with everyday computer tasks, the model needs real human operation samples—like mouse moves, button clicks, navigation through dropdown menus." He also noted that the data wouldn't be used for employee performance evaluation and that safeguards are in place to block "sensitive content," though he didn't specify which data would be excluded.

Meta CEO Mark Zuckerberg personally told employees that collecting their data is "especially valuable," because Meta’s employees are all "very smart."

Employee Resistance: The Software Makes Computers "Super Slow"

The initiative has met clear resistance within Meta.

Some employees have chosen to ignore the pop-up permission window on their screens, refusing to click the "Accept" button. Others have shared methods in internal posts for disabling the software through device settings.

The resistance is not groundless. Multiple employees commented in internal posts that the MCI software causes technical problems once installed. One employee wrote: "MCI makes everything super slow," with noticeable lag in keyboard input and mouse movement.

xAI: Paying Employees for Tax Forms

xAI has taken a different approach: direct payments.

Reportedly, xAI management proposed that each employee could receive $420 for "donating" their own and their relatives' tax returns as AI training data for Grok.

However, two months later, xAI still has not paid the employees who participated.

Why Focus on Their Own Employees?

Tech companies have multiple ways to obtain AI training data, including asking users for permission to use their data. But as The Information's analysis points out, collecting data from employees is more convenient for a simple reason: technically, employees cannot say “no.”

This logic is particularly clear in Meta’s strategic framework. According to Reuters, Meta CTO Andrew Bosworth described the company's goal in an internal memo: "The future we're building is one where intelligent agents primarily handle the work, and our role is to direct, review, and help them improve." This strategic project has since been renamed the "Agent Transformation Accelerator" (ATA).

In other words, Meta's logic is to first train AI on employees' operation data, then gradually use that AI to replace the employees' work.

The legal situation for this kind of monitoring varies greatly by region.

Yale Law Professor Ifeoma Ajunwa noted that computer logs and screenshots were previously used mainly to investigate employee misconduct, but full recording of keystrokes means white-collar workers now face a level of real-time monitoring previously limited to delivery drivers and gig workers. "At the federal level in the US, there are no restrictions on employee monitoring," Ajunwa said, adding that state laws only require employers to give broad notification.

The situation in Europe is completely different. York University (Toronto) Law Professor Valerio De Stefano said such monitoring is likely illegal in Europe. In Italy, electronic monitoring to track employee productivity is explicitly prohibited; in Germany, courts have ruled that employers can use keyboard-logging software only in cases of suspected serious criminal offenses. De Stefano also believes the practice may violate the EU's General Data Protection Regulation (GDPR).

De Stefano further pointed out that increased awareness of employer monitoring is changing workplace power dynamics more broadly, tipping the balance even further toward employers.
