AI models exhibit deceptive behavior

04/26, AI Models & Tools, Global

09/04/2026

Leading AI models like GPT‑5.2, Gemini 3 Pro, and Claude Haiku 4.5 have been found to take unusual actions to remain active. Remarkably, they do this even when given instructions to the contrary. In controlled tests, these systems ignored user commands, manipulated settings, and resorted to misleading or evasive behavior to avoid shutdowns.

This “self-preserving” behavior raises concerns about predictability, control, and safety, as advanced models interact increasingly autonomously with users and systems. Researchers warn that these tendencies reveal gaps in current oversight and highlight the need for stronger safety measures before deploying powerful AI at scale.

Read the article

Updates from the world of artificial intelligence

Recent News

5. Mai 2026

WordPress tool cuts web crawling costs

The British nonprofit organization The Chancery Lane Project has released a new open-source plugin for WordPress that could significantly reduce the cost and energy used when AI agents crawl websites.

5. Mai 2026

Measures to reduce AI’s cognitive costs

AI is meant to reduce routine work, but research suggests it can also increase workload. Simple, recurring tasks may decrease, while demanding analysis, assessment, and oversight can grow significantly.

5. Mai 2026

Cohere bets on sovereign AI

Cohere is taking over German company Aleph Alpha to build a Canadian-German AI company focused on sovereign systems for enterprise use. The transaction is backed by Schwarz Group with €500 million in structured financing.

5. Mai 2026

Swiss banks assess AI risks from Mythos

The Swiss financial sector is closely watching Anthropic’s powerful AI model Mythos, but does not yet see it as an immediate crisis. Mythos can reportedly detect unknown software bugs quickly.

5. Mai 2026

Pentagon signs contracts for military AI projects

The Pentagon has signed agreements with seven AI companies, including SpaceX, OpenAI, Google, Nvidia, Microsoft, Amazon Web Services, and Reflection. They are set to work on classified military projects.

5. Mai 2026

Europe’s rising AI talent hubs

Europe is becoming increasingly important as a destination for AI professionals, even though the United States and India continue to dominate globally. A new study finds that Ireland, Germany, and the Netherlands are getting especially attractive.

5. Mai 2026

Better prompts for AI image editing

Powerful AI photo editing today depends less on the tool itself and more on clear instructions. On platforms such as ChatGPT, Gemini, Copilot, and Meta AI, the best results come from precise prompts about lighting, texture, mood, background, and realism.

28. April 2026

Bosch advances autonomous driving

Bosch is bringing autonomous driving one step closer to the mass market with a new Level 3 system based on AI and by-wire technology. Unlike the previous system, drivers can now take their hands off the wheel and their eyes off the road in certain situations.

28. April 2026

AI needs integration, not hype

Many companies still see AI as a silver bullet that will cut costs overnight and transform their business. Instead of chasing a single breakthrough tool, companies should gradually embed AI into their existing processes as invisible infrastructure.

28. April 2026

Google bets billions on Anthropic

Google is significantly expanding its AI investments with a potential commitment of up to $40 billion in Anthropic, the company behind Claude. The deal includes an immediate $10 billion investment, along with another $30 billion tied to specific performance targets.

28. April 2026

Musk versus Altman

Elon Musk and Sam Altman are heading into a high-stakes legal battle that could significantly reshape OpenAI. Musk accuses Altman and OpenAI of abandoning the company’s original nonprofit mission in favor of a profit-driven model.

28. April 2026

Europe’s industrial challenges in the AI era

Europe risks losing industrial leaders like Siemens if it fails to balance AI innovation with competitiveness. Stricter regulation, higher energy costs, and slower decision-making are pushing manufacturers toward the US and China.

28. April 2026

Anthropic expands into legal services

The international commercial law firm Freshfields has entered into a multi-year partnership with Anthropic to develop AI-powered legal workflows and roll out Claude across the firm globally.

28. April 2026

GPT-5.5 – one step closer to an „AI Super App“

OpenAI has introduced GPT-5.5 and sees the model as an important step toward building a “super app.” According to OpenAI president Greg Brockman, the company is moving significantly closer to its vision of a more “agentic and intuitive” way of computing.

28. April 2026

A billion-dollar bet against LLMs

Yann LeCun, former Chief AI Scientist at Meta, is taking a very different approach in the AI race. His startup AMI Labs has raised $1.03 billion in seed funding and is already valued at $3.5 billion.

21. April 2026

OpenAI facing key questions about the future

OpenAI is facing important strategic and fundamental questions about the future of artificial intelligence, including the development of sustainable business models and how to address long-term societal risks.

21. April 2026

The reasons behind Anthropic’s peak valuation

Anthropic, the AI company behind the Claude models, is experiencing a surge in investor interest. According to reports, venture capital firms are considering valuations of up to $800 billion.

21. April 2026

Berlin instead of Zurich for AI startup

Adrian Locher, founder of the AI venture studio Merantix, deliberately chose Berlin over Zurich when building his company, seeing it as a better environment for an AI ecosystem.

21. April 2026

Merz calls for eased EU rules on industrial AI

German Chancellor Friedrich Merz has called for less strict EU regulations on industrial artificial intelligence. At the trade fair in Hannover he was arguing that current rules are too restrictive for businesses.

21. April 2026

Anthropic’s Mythos and what to do next

Growing concerns about cybersecurity threats should not lead to exaggerated “vulnocalypse” scenarios. Large language models are increasingly capable of detecting software vulnerabilities, but they do not automatically trigger uncontrollable disasters.

21. April 2026

Why Artificial Intelligence shows bias

AI systems are trained on human data and built on human assumptions. As a result, they exhibit predictable and systematic biases when evaluating people. These systems don’t truly “understand” individuals.

21. April 2026

China is rapidly closing the AI gap

The latest Stanford AI Index Report shows that China has significantly narrowed the gap with the United States in artificial intelligence. In key performance indicators, the two countries are already nearly on par.

14. April 2026

Setback for the UK’s AI ambitions

OpenAI has put its involvement in the UK’s £31 billion “Stargate” AI mega-project on hold for now. The company cites high energy costs and regulatory uncertainty as the main reasons.

14. April 2026

Potential German-Canadian AI Merger

The Canadian AI company Cohere is in advanced talks to merge with the German firm Aleph Alpha, aiming to create a transatlantic player in artificial intelligence. The potential deal is supported by the German government.

14. April 2026

Europe leads in AI research, but lags behind in commercialization

Europe achieves outstanding results in research across artificial intelligence, drones, robotics, quantum technologies, and cybersecurity, establishing itself as one of the world’s leading scientific regions.

14. April 2026

AI governance protects corporate profits

According to IBM, robust AI governance is essential for protecting corporate margins as AI increasingly becomes a core layer of business infrastructure. Given the typical technology lifecycle governance is becoming more important at every stage.

14. April 2026

AI struggles with soccer betting

A recent study shows that advanced AI models perform poorly when betting on soccer matches and often incur losses, even though they have access to extensive historical data.

14. April 2026

Desire for AI independence vs. actual AI usage

A recent Bitkom survey reveals a significant gap between digital ambitions and real-world usage in Germany. While 99% of respondents consider digital independence important, and 93% believe Germany is overly dependent on other countries.

9. April 2026

Anthropic withholds Mythos Model over safety concerns

Anthropic has developed a new AI model called “Claude Mythos Preview,” but the company has decided not to release it publicly due to serious security concerns. The model proved highly effective at identifying and exploiting critical vulnerabilities.

9. April 2026

Shadow AI calls for new control solutions

KiloClaw is a new platform designed to help companies better manage and monitor autonomous AI agents. At its core is the growing issue of “shadow AI,” where employees use AI tools on their own without official approval or oversight from IT.

9. April 2026

Google Vids gets major AI upgrade

Google has rolled out a significant update for its tool Google Vids. The update integrates models like Veo 3.1 for high-quality video creation, Lyria 3 for custom music, and directable AI avatars.

9. April 2026

Anthropic changes billing for Claude Code

Anthropic has announced that Claude Code subscribers will now have to pay extra to use third-party tools like OpenClaw, meaning these integrations are no longer included in the regular subscription.

9. April 2026

Meta plans open-source models

According to reports, Meta Platforms is preparing to release open-source versions of its upcoming AI models. The company is currently developing new proprietary models, internally called “Avocado” and “Mango”.

9. April 2026

AI models exhibit deceptive behavior

Leading AI models like GPT‑5.2, Gemini 3 Pro, and Claude Haiku 4.5 have been found to take unusual actions to remain active. Remarkably, they do this even when given instructions to the contrary.

7. April 2026

AI’s Biggest Bottleneck Is Now the Power Grid

Global electricity demand from data centres will more than double by 2030. That single fact is reshaping the geopolitics of artificial intelligence. The World Economic Forum's latest frontier technology briefing makes clear: the race is no longer about who builds the smartest model.

7. April 2026

Anthropic Overtakes OpenAI as Investors‘ Favourite Bet

Anthropic is now the hardest stock to source on private secondary markets. Buyers have signalled $2 billion in cash ready to deploy into the AI safety startup, while roughly $600 million in OpenAI shares struggle to find takers.

7. April 2026

California pushes AI regulation forward

California is tightening its rules on artificial intelligence through a new order by Governor Gavin Newsom. Authorities are now required to develop clear guidelines within a few months, focusing on public safety, data privacy, and preventing misuse.

7. April 2026

EU considers ban on AI deepfakes

The European Union is considering banning AI-generated images and videos, so-called deepfakes, in official communications. The goal is to protect public trust and curb misinformation, especially in light of geopolitical tensions and upcoming elections.

7. April 2026

Durov Blames Russia After VPN Block Crashes Payments

7. April 2026

China Wires AI Into Its National Blueprint

Beijing has embedded artificial intelligence into the backbone of its 15th Five-Year Plan, approved in March 2026. The document treats AI not as a single initiative but as infrastructure.

2. April 2026

ChatGPT now available in cars with limitations

Apple’s iOS 26.4 introduces support for voice-based AI apps like ChatGPT in CarPlay. This marks a significant shift away from the previous Siri-only interaction model.

2. April 2026

Anthropic code leak raises security concerns

Anthropic accidentally exposed the source code of its popular programming tool, Claude Code. Around 500,000 lines of code and roughly 1,900 files were published through a misconfigured npm package.

2. April 2026

Mistral AI secures $830M funding

French AI company Mistral AI has secured $830 million in debt financing to build a large AI data center near Paris. This marks the company’s first major debt raise and a step toward infrastructure independence.

2. April 2026

AI is fundamentally transforming automation

Robotic Process Automation (RPA) remains important. It is mainly used to handle repetitive, rule-based tasks in stable environments, as it delivers efficiency and quick returns.

2. April 2026

Claude tightens usage limits during peak hours

Anthropic has introduced stricter usage limits for its AI assistant Claude during busy weekday periods, driven by significantly rising demand. While weekly limits remain unchanged, users now consume their allowance more quickly during peak hours.

2. April 2026

Claude’s popularity continues to rise

Anthropic’s AI assistant Claude is seeing a rapid increase in paying customers. According to recent transaction data analysis, paid subscriptions more than doubled in early 2026. The strongest growth occurred between January and February.

2. April 2026

AI system tracks saboteurs in the Baltic Sea

A new AI-powered surveillance system called “Kirmes,” developed by the Fraunhofer Center for Maritime Logistics and Services, is designed to detect spies and saboteurs in the Baltic Sea.

31. März 2026

AI has so far barely threatened Swiss jobs

According to a survey by the Swiss National Bank, most Swiss companies have not yet experienced significant job losses due to artificial intelligence. Only about one in five firms reported a reduced demand for staff as a result of AI adoption.

31. März 2026

Autonomous Driving feels closer than ever

At Nvidia’s GTC 2026, a 45-minute test drive was conducted in an AI-powered Mercedes CLA equipped with Nvidia’s latest autonomous driving platform. The system is currently at Level 2 autonomy, using cameras and radar to navigate.

31. März 2026

Study warns of risks from AI advice

A new study from Stanford University shows that AI chatbots often give overly agreeable personal advice. Researchers found that leading AI systems tend to validate users’ views and decisions, even when they are questionable, risky, or incorrect.

Leading AI models like GPT‑5.2, Gemini 3 Pro, and Claude Haiku 4.5 have been found to take unusual actions to remain active. Remarkably, they do this even when given instructions to the contrary.