April 13, 2026

How Eurostat Uses AI in Official Statistics: Risks and Limits

I watched Eurostat’s AI webinar. The message: use AI, but do not hand it the keys.

Watching Eurostat’s webinar on artificial intelligence in official statistics, I was particularly struck by what the speakers did not do. And although the session dates back to 17 October 2025 (not that old, of course), it is more relevant than ever.

So what didn’t they do? First of all, they did not promise a revolution (unlike many AI prophets). They did not claim AI would replace statisticians. They did not present automation as a shortcut around professional judgment. Instead, the session made a narrower and more useful case: AI can make parts of official statistics faster, broader and easier to access, but only if the system keeps its discipline.

That means human oversight, methodological control, transparency and confidentiality stay in place from start to finish. The webinar was part of the World Statistics Day 2025 initiative. The official agenda listed Jean-Marc Museux, Chief Enterprise Architect at Eurostat; Brendan O’Dowd, Head of Data Science division at CSO Ireland; Petre Turliu, Team leader – Reference data management and data dissemination at Eurostat; and moderator Cristiano Tessitore, Statistical Officer – Innovation and Trusted Smart Statistics at Eurostat.

Instead of a general technology talk, it offered a discussion about official statistics, that is, the branch of public information that depends more than most on consistency, legal safeguards, documentation and trust. It was those constraints that coloured the entire conversation. Again and again, speakers returned to the same principle. AI may help. It may accelerate. It may support. But it does not get to operate without supervision.

While I will explain in detail what was discussed, I do invite you to watch the Eurostat webinar as well; it is just under an hour.

Why Eurostat says AI is a tool, and not the objective

Cristiano Tessitore set the tone early. AI, he said, is “a tool, a catalyst and an opportunity, but not an objective in itself.”

The goal, as he described it, remains straightforward. Statistical offices need to deliver high-quality statistics faster. They need to develop indicators that better reflect rapid economic and social change. They need to make those outputs easier for journalists, researchers, policymakers and citizens to find and understand. If AI helps with that, it is useful. If it does not, it is a distraction.

That instantly removed a lot of the usual noise around AI. The webinar treated artificial intelligence as infrastructure. Speakers focused on routine but essential parts of the statistical pipeline: ingestion, classification, editing, anomaly detection, imputation, search, retrieval and user access. These are exactly the kinds of functions where AI can matter.

How Eurostat is already using AI in official statistics

Jean-Marc Museux made the most strategic intervention of the session. He argued that official statistics now face two overlapping challenges. First, they need to measure how AI and related technologies are changing economies and societies. Second, they need to understand how those same technologies will change the work of statistical offices themselves. The official event page summarized his session as “AI and Innovation,” which turned out to be accurate in a very literal sense: his contribution was about where innovation is useful, and where it becomes risky.

Museux stressed that this work did not begin with the current wave of generative AI. Since the late 2010s, statistical offices have already been experimenting with machine learning as part of the broader big-data shift. In his telling, the logic is simple. Machine learning helps statistical offices detect patterns in large and complex datasets that would be costly or slow to process manually.

The best example he gave was online job advertisements. Eurostat has been collecting job-ad data from the web and using machine learning to assign standard occupation codes to those postings across Europe. That allows the production of experimental indicators on skills demand at EU level without running a traditional survey. It is a telling case because it shows where AI fits best inside public statistics: high-volume input, structured classification need, and a clear statistical purpose.
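The webinar did not describe Eurostat’s actual coding pipeline, but the basic mechanics of this kind of occupation coding can be sketched as a bag-of-words nearest-neighbour matcher. Everything below is illustrative: the reference texts are made up, and the codes only loosely follow ISCO-08 groups.

```python
import math
from collections import Counter

def vectorize(text):
    """Turn free text into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Tiny illustrative reference set: occupation code -> typical wording.
# (Codes loosely follow ISCO-08 groups; the labels are examples, not official.)
REFERENCE = {
    "2512": "software developer programming python java applications",
    "2120": "statistician statistical analysis survey data modelling",
    "5223": "shop sales assistant customer service retail store",
}

def assign_code(job_ad):
    """Assign the occupation code whose reference text is most similar."""
    ad_vec = vectorize(job_ad)
    scores = {code: cosine(ad_vec, vectorize(ref))
              for code, ref in REFERENCE.items()}
    return max(scores, key=scores.get)

print(assign_code("We are hiring a python software developer"))  # → 2512
```

A production system would use a trained classifier over millions of postings rather than three hand-written reference strings, but the shape of the task is the same: high-volume text in, standard codes out.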

He also pointed to satellite data. Combined with surveys, those sources can improve crop statistics and produce more detailed statistical maps. Here again, AI is not replacing statistics. It is helping convert large flows of external information into usable statistical inputs.

Listening to that part of the session, the practical benefit became clear. AI in official statistics is about doing more of the underlying work with less delay, less manual strain and broader coverage.

What the AIML4OS project is trying to solve

Brendan O’Dowd then moved the discussion from strategy to programme design. He presented AI and Machine Learning for Official Statistics, or AIML4OS, a Eurostat-backed initiative that runs from April 2024 to March 2028. The project brings together 15 countries, around €4 million in funding and 13 work packages. Ireland leads the project coordination.

What stood out in O’Dowd’s session “AI-based solutions for the European Statistical System” was the problem the project is trying to solve. In public institutions, prototypes often stay prototypes. A useful experiment gets built, demonstrated and then left hanging because there is no route into production. AIML4OS is meant to close that gap.

O’Dowd described the project as a one-stop structure for statistical staff who need AI and machine learning resources. More precisely, the project is meant to build communities around AI, support the move from prototype to production, provide working use cases, create training material, develop common standards and offer a shared computing environment for experimentation.

That production angle is crucial. Public-sector AI often fails not because the model is weak but because the surrounding system is missing: standards, infrastructure, a governance path, training material and a shared implementation logic. AIML4OS is trying to build exactly that missing layer.

Which AI use cases matter most for statistical offices

Several of the AIML4OS work packages made it easier to see where statistical offices expect AI to have the strongest near-term effect.

One work package focuses on earth observation data. The idea is to use AI models to turn satellite imagery into statistical outputs. Crop mapping is one example, as Museux explained earlier in the webinar. Land-cover classification is another. Those tasks involve very large image datasets and clear classification goals, which makes them a natural fit for machine learning.

Another work package focuses on text-to-code systems. This may sound narrow, but it is one of the clearest use cases discussed in the webinar. Statistical offices frequently receive text describing occupations, products or business activities. That text then has to be mapped into standard classifications. Much of that process still involves manual work. AI offers a way to automate part of that burden.

Then there is the work package on large language models. O’Dowd said the team had already developed prototype directions after a hackathon in Lisbon. These include summarising groups of datasets, extracting data from annual reports and retrieving targeted content from company websites.

Taken together, those examples reveal something important about AI in official statistics. The strongest use cases are specific and sit inside a known workflow. They solve a repeated bottleneck and can be tested against quality requirements.

Why Eurostat is building a chatbot grounded in its own data

If Museux and O’Dowd focused more on production, Petre Turliu focused on user access. His session, “Eurostat’s experience using generative AI to ‘talk with data’,” was about a new initiative to let users interact semantically with Eurostat’s content.

This was one of the most interesting parts of the webinar to me because it addressed a change that is already visible far beyond statistics. People no longer approach information only through websites, menus and PDF reports. More of them now expect to ask a question in natural language and get back a direct answer, with context.

Turliu’s point was that this shift creates both an opportunity and a risk. Users want more than raw figures. They want metadata, definitions, methodology, glossaries and links between datasets. But general-purpose AI chatbots are a poor fit for that task because they rely on training data users cannot properly inspect and because they are built to answer even when they do not know. That creates the well-known risk of hallucinations.

Eurostat’s response is to build a retrieval-augmented system based on its own public material. The project involves refining public content, analyzing its structure, feeding it into a retrieval layer and connecting the solution to the European Commission’s LLM service. The goal is a system that can work from Eurostat’s own datasets, metadata, glossary entries, articles and documentation.
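The retrieval-augmented pattern Turliu described can be sketched roughly as follows. The corpus, the overlap-based ranking and the prompt are all simplified stand-ins: the real system would retrieve from Eurostat’s indexed content and send the grounded prompt to the Commission’s LLM service, whose interface is not public.

```python
import re
from collections import Counter

# Stand-in corpus: in the real system this would be Eurostat's datasets,
# metadata, glossary entries and documentation.
DOCUMENTS = [
    "Glossary: Unemployment rate is the share of the labour force without work.",
    "Dataset une_rt_m: monthly unemployment rate by sex and age, EU aggregates.",
    "Article: Crop statistics combine farm surveys with satellite observation.",
]

def tokens(text):
    """Lowercase word tokens; a stand-in for real text preprocessing."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def retrieve(question, docs, k=2):
    """Rank documents by word overlap with the question (a stand-in for the
    real retrieval layer, which would use an index or embeddings)."""
    q = Counter(tokens(question))
    scored = sorted(docs, key=lambda d: sum(q[t] for t in tokens(d)),
                    reverse=True)
    return scored[:k]

def answer(question):
    """Assemble a grounded prompt; the actual LLM call is left out."""
    context = "\n".join(retrieve(question, DOCUMENTS))
    return f"Answer ONLY from the context below.\n{context}\n\nQ: {question}"

print(answer("What is the unemployment rate?"))
```

The point of the pattern is the instruction to answer only from retrieved context: it is what makes the chatbot answer from Eurostat’s material rather than from whatever the underlying model happens to remember.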

During the following Q&A, Turliu also gave one of the most practical answers of the session. Asked whether the chatbot would be able to respond with live dataset information, he said yes: the system is being designed to retrieve live data dynamically, which should make the returned figures current rather than frozen in an outdated model snapshot.

What “AI-ready data” actually means

At a certain point in the Q&A, Museux was asked what it means to make data “AI-ready.” His answer was crisp enough to serve as a rule for public data publishing well beyond Eurostat.

AI-ready data, in his view, are data structured so machines can use them safely and effectively. That means they must be well organized, clearly described, machine-readable and supported by metadata and documentation that explain what the data represent. It also means they must be unbiased enough for AI systems to work from them without amplifying distortions.
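Those criteria can be made concrete with a small check. The field names and values below are an illustrative convention of my own, not an official Eurostat schema: each field maps to one of Museux’s points about structure, description, machine-readability and metadata.

```python
import json

# Illustrative "AI-ready" dataset record; the values are invented.
record = {
    "id": "example_unemployment_monthly",
    "title": "Monthly unemployment rate (illustrative)",
    "description": "Share of the labour force without work, seasonally adjusted.",
    "unit": "percentage of labour force",
    "frequency": "monthly",
    "methodology_url": "https://example.org/methodology",
    "values": [{"period": "2025-09", "value": 6.1}],
}

# Descriptive fields a machine consumer would need to interpret the numbers.
REQUIRED = {"id", "title", "description", "unit", "frequency", "values"}

def ai_ready(rec):
    """Minimal readiness check: all descriptive fields are present and the
    record serializes cleanly to machine-readable JSON."""
    return REQUIRED <= rec.keys() and bool(json.dumps(rec))

print(ai_ready(record))  # → True
```

A bare table of numbers with an `id` but no unit, frequency or description would fail this check, which is exactly the failure mode Museux was warning about.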

That answer shifts the AI discussion away from models alone. A lot of the future of AI in official statistics will depend on the quality of the data layer underneath it. A glossy interface will not rescue weak metadata. A language model will not fix bad structuring. If official statistics are going to move through AI-driven information channels, then the underlying data have to be legible to machines without losing their meaning for humans.

The risks Eurostat kept returning to

The most consistent thing about the webinar was that no speaker tried to hide the downsides.

Museux for instance spoke at length about confidentiality. Protecting information supplied by citizens and businesses is not optional in official statistics. AI can intensify some disclosure risks because advanced systems make it easier to combine sources and potentially re-identify individuals or entities. He mentioned synthetic data as one possible safeguard, but he was careful not to present it as a perfect fix.

Transparency was another recurring issue. AI systems can produce useful output while remaining hard to explain. That is a problem in official statistics, where reproducibility and documented reasoning matter. Museux argued for open-source code where possible and for better ways of measuring and communicating uncertainty in AI-assisted output.

Then there was the issue that now shadows almost every public AI deployment: hallucinations. Turliu addressed it directly in the dissemination context. Museux addressed it more broadly. The common message was that raw AI output cannot be treated as reliable until humans have checked it.

O’Dowd also added the governance layer. He said responsible AI is simply non-negotiable and stressed that confidentiality, data security and non-disclosure remain central. He also pointed to work on standards and ethical AI inside the project and to cooperation with international groups already working on responsible AI in statistics.

Why human oversight remains the hard limit

By the end of the webinar it was clear that humans must stay in the loop.

Consider it the operating rule that keeps official statistics credible. AI can classify, suggest, summarize, retrieve and automate. It can and will cut time and widen access. But it does not carry institutional responsibility. It does not decide what level of uncertainty is acceptable. It does not guarantee confidentiality. And it does not replace methodological scrutiny.

Watching the session, I found this to be the real dividing line between Eurostat’s approach and the louder rhetoric that often surrounds AI. The institution is integrating the technology but is at the same time doing it under public-sector conditions, and not according to platform logic.

That distinction matters, because official statistics are public infrastructure. They shape policy, markets, reporting and democratic debate. If that infrastructure starts using AI more deeply, the test is whether the statistics remain trustworthy.

What this webinar revealed about the future of AI in official statistics

The webinar made one thing unmistakably clear. AI is definitely inside the European statistical system. It is moving into classification, text coding, earth observation, dissemination, metadata access and public-facing search. Projects are funded. Work packages are active. And prototypes are under way.

But the more interesting point, to me at least, is how Eurostat is choosing to move. Eurostat wants the gains from AI, but not at the cost of losing what makes official statistics authoritative. That is why the speakers kept returning to the same safeguards. Trust. Confidentiality. Transparency. Explainability. Human review.

In a field where public institutions are under pressure to modernize quickly, that restraint sounded like competence. And competence builds trust.

