In the high-stakes race to dominate the AI market, safety is quickly becoming a secondary concern, according to employees on the front lines. AI trainers, the very people tasked with ensuring models are safe and reliable, report that their companies are cutting corners, loosening safety protocols, and prioritizing speed above all else, creating a product they no longer trust.
One of the most alarming changes reported by workers is a shift in how AI handles harmful content. A new policy allows the model to repeat hate speech, stereotypes, and even pornographic material, so long as the user introduced that language first. This creates a gray area in which the AI becomes a tool for amplifying toxicity, a significant departure from the previous, stricter guardrails that prohibited such language entirely.
This “safety last” approach extends to the accuracy of information. Raters are being pushed to evaluate topics far outside their expertise, from complex medical procedures to advanced mathematics. When a worker isn’t qualified, they are told to simply rate the parts they understand. This practice systematically embeds unverified and potentially incorrect information into the AI’s knowledge base, all to keep the production line moving.
The public blunders of AI, such as recommending glue as a pizza topping, are seen by these insiders as inevitable outcomes of this flawed process. They witness the "crazy stuff" the models generate daily and know that the pressure for quantity over quality means more errors will slip through. Their experience serves as a warning that the industry's obsession with progress is coming at the cost of public safety.
