Silvana Fumega, PhD
International Expert in Governance of Public Data & Inclusive Data Practices | Co-Founder | Director | Board Member

As we close the second edition of the Global Data Barometer (GDB), I have been having many conversations with colleagues and friends about where the data field stands today. These chats remind me that nothing here is static. Priorities shift, technologies move faster than we expect, and even the words we use to describe our work are always up for debate.

This post is an invitation to scholars, funders, governments, and practitioners to pause and ask: which data truly matter, who defines their value, and to what ends they are used?

From Open Data to Ecosystems

When the open data agenda burst onto the scene more than a decade ago, the promise was simple: governments would release datasets and society would use them for transparency, innovation, and accountability. The Open Data Barometer, launched in 2013, gave us the first comparative picture of how countries were doing. It showed progress but also exposed big gaps.

By 2019, the conversation had already shifted. Publishing datasets on portals was no longer enough. Questions of data protection, interoperability, and sharing were taking center stage. That was the impetus for the Global Data Barometer, which tried to capture this broader landscape. The first edition in 2022 gave us a global baseline across 109 countries. The second, released in 2025, focused on Latin America, the Caribbean, and Africa, and tested a new way of looking at data ecosystems, linking governance foundations, critical skills, and thematic areas like land, company information, and AI.

What did we find? Progress, yes, but slow and uneven. Many countries have strategies and laws on paper, but enforcement is patchy. Datasets rarely talk to each other. Legal frameworks, infrastructure, and human skills seldom line up. The question may no longer be whether data exists, but whether ecosystems of people, institutions, policies, and technologies can actually work together to make it matter.

Open Data: Stuck in Time

By the second edition, the definition of open data was largely settled. Even after efforts to update it in 2023, the core elements had not evolved. Standards matured, and the debate about PDFs versus machine-readable formats felt like ancient history. Yet while the field moved on to data protection, interoperability, and AI, the concept of open data stood still. The language persisted, but it no longer reflected today’s challenges.

Meanwhile, technology kept advancing. Generative AI can now read PDFs in seconds. Does that mean the problem is solved, or are we just covering old flaws with quick fixes instead of building real interoperable systems? Add the rise of vector data, from geospatial layers to the embeddings behind AI, and suddenly new questions emerge about usability, ownership, and fair access.

We risk clinging to outdated definitions or rushing into shiny new acronyms. Everyone is talking about AI strategies and digital public infrastructure. But what happens if the foundations remain weak? Are we just renaming old gaps? As researchers such as Nagar and Eaves, or Mellon, Peixoto, and Sjoberg, remind us, without strong foundations we risk building castles on sand. The way civic platforms are designed and the quality of digital infrastructures ultimately determine who benefits, and whether these new terms drive real change or simply repackage persistent inequalities.

Measuring Value

The question of measurement cannot be separated from the question of value. Both the European Union and South Korea have formalized approaches to this. Korea’s 2025 National Priority Data Opening Project is releasing high-value datasets to support AI training and business innovation, from renewable energy metrics to childcare programs. The European Union has embedded high-value datasets in law through the Open Data Directive, requiring member states to publish them under open conditions.

These initiatives define value through economic and technological lenses. In the open data field, value has also been tied to accountability. Both remain important, but they leave other dimensions in the shadows. If we only measure what fits into economic or accountability frames, what gets erased?

Even the economic story is less clear than it seems. Companies are often described as big users of public data, but the evidence is unsystematized. Do they really rely on what governments publish, or on proprietary sources? And when they do capture value from public data, how do we ensure those benefits flow back to society instead of concentrating power further? Too often, public data ends up reinforcing market concentration, where a few large firms have the resources to exploit it while smaller actors are left behind.

Trust and Synthetic Data

Value is inseparable from trust. Who decides what counts as “high-value,” and who gets to benefit? This is not just about choosing datasets but about confidence in the data itself.

Korea’s Ministry of the Interior and Safety, for example, has announced plans to disclose sensitive datasets using synthetic data or authenticity verification services. Synthetic data refers to information artificially generated to resemble real data without being drawn from actual people or events. Hospitals use it to create “fake” patient records that mimic health trends while protecting privacy. Banks use it to train fraud detection systems without exposing customer histories.
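To make the idea concrete, here is a minimal, purely illustrative sketch in Python. It is not drawn from any system mentioned in this post: the "real" values are hypothetical stand-ins, and the approach (fit simple summary statistics, then sample artificial values with a similar shape) is only the simplest possible version of what synthetic data generation means.

    import numpy as np

    # Minimal, illustrative sketch of synthetic data generation.
    # The "real" values below are hypothetical stand-ins, not actual records.
    rng = np.random.default_rng(42)
    real_ages = np.array([23, 35, 47, 52, 61, 29, 44, 58, 66, 39])

    # Fit simple summary statistics to the real column...
    mean, std = real_ages.mean(), real_ages.std()

    # ...then sample artificial values that mimic its overall shape
    # without copying any individual record.
    synthetic_ages = rng.normal(mean, std, size=1000).round().clip(0, 110)

    print(f"real mean: {mean:.1f}, synthetic mean: {synthetic_ages.mean():.1f}")

Real synthetic data pipelines are far more sophisticated, and they must guard against re-identification and statistical distortion, which is exactly where the questions of trust below come in.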

The goal is simple: make more data usable while still protecting privacy. But this raises deeper questions. If data is altered, simulated, or mediated through technical fixes, how can users be confident in its authenticity and reliability? Synthetic data may reduce risks, but it also blurs the line between what is original and what is generated. Who should be trusted to guard the integrity of information: governments, independent oversight bodies, or distributed communities? And what happens if none are trusted enough?

A Way Forward

Technology is moving faster than the rules, institutions, and skills needed to manage it. AI is already reshaping how we produce, share, and consume data, while many countries still struggle to enforce existing laws or connect fragmented systems. Infrastructure remains weak, resources scarce, and the people who should be overseeing these systems often lack support.

Adoption is also uneven. Without addressing these gaps, we risk repeating the cycle: promises, pilots, and portals followed by frustration and inequality. The real challenge is not to pick the next buzzword but to ask harder questions. Which data really matter, for whom, and to what end? How do we build trust before the ground shifts again? Can we imagine ways of valuing data that go beyond economic utility or accountability checklists? Can we build ecosystems that distribute power instead of concentrating it?

Beyond the buzzwords, these are the questions that will shape the years ahead. I do not have definitive answers. What I have are doubts and provocations that can open new conversations. What I am sure of is that we need to keep asking them together. The Global Data Barometer can offer evidence and spark discussion, but it will take collective effort and fresh ideas to move forward. In future notes and posts, I will try to unpack these themes more fully, from trust and synthetic data, to foundations and oversight, to the shifting ways we define openness and value, in the hope of sparking debates and approaches we can shape together.
