Google is using Anthropic’s Claude to improve its Gemini AI

Google’s use of Anthropic’s Claude to improve its Gemini AI underscores the intense competition in the AI industry. According to a report by TechCrunch, contractors working on Gemini are tasked with comparing its responses to those of competitor models like Claude. In this manual evaluation process, contractors score each response on criteria such as truthfulness and verbosity, and they are given up to 30 minutes per prompt to decide which answer is better.

Key Points:

Comparison Testing with Claude:
- Contractors observed explicit references to Claude during the evaluation process, with one response even stating: “I am Claude, created by Anthropic.”
- Contractors noted that Claude’s responses prioritize safety more than Gemini’s, often declining to answer prompts it deems unsafe.

Safety Concerns:
- Claude’s strict safety behavior, such as declining unsafe role-play scenarios, contrasted with Gemini responses that were flagged for safety violations.
- Gemini’s outputs reportedly included inappropriate content in some cases, raising further concerns about its safeguards.

Ethical and Legal Questions:
- Anthropic’s terms of service prohibit using Claude to build or train competing AI models without explicit approval.
- While Google states it does not use Claude to train Gemini, it remains unclear whether Google obtained permission to use Claude for testing purposes.

Challenges in Evaluation:
- Contractors have raised concerns about being asked to evaluate Gemini’s responses on specialized topics, such as healthcare, without adequate expertise, which risks inaccurate ratings on sensitive subjects.

Industry Practices:
- Comparing AI model outputs is common in the tech industry. However, the ethics and legality of using direct competitor models, especially without clear permission, remain contentious.

Google’s investment in Anthropic further complicates the dynamics, as any perceived misuse of Claude might strain the relationship between the two companies. With safety and accuracy increasingly under scrutiny, this development highlights the challenges of balancing innovation with ethical AI development practices.