May 2025
National data inventories: A trial database to kickstart the process
Few African countries maintain comprehensive inventories of their data infrastructures. Uganda’s Portfolio of Government System Services is a rare example. Inventories such as this should be an essential element in governments’ efforts to prioritise digital transformation and to manage the often disruptive and duplicative interventions of donors and international organisations. Contrary to current dogma, all public information systems should be considered digital public infrastructure, and in a world of dwindling finances, the rationalisation of these resources cannot take place without reliable intelligence.
I’ve had an interest in monitoring the development of information systems in Africa for fifteen years. It started with a website where I copied and pasted news articles gathered from RSS feeds, and progressed into a methodology to assess development data, which in turn led to a number of consultancies producing national data landscape diagnostics.
Building national inventories should be a relatively straightforward exercise for governments to perform and maintain. But given current inertia, is there a way of kickstarting this process from outside government? While most systems remain offline and are therefore not directly discoverable, the amount of online content describing or referencing them is growing. A job for a large language model?
In most comparisons of LLMs, Perplexity emerges as the best suited to research and information retrieval. So, armed with a subscription to Perplexity Pro, I decided to give it a test drive. A summary of my methodology can be found below.
The current (first) version of the resulting database can be found here. It contains 5,315 records of systems across 53 countries, 21 sectors and 122 subsectors.
Is this a credible and authoritative dataset?
Clearly not. It contains many ‘hallucinations’, system modules treated as main systems, and cross-cutting systems duplicated across sectors.
But is it useful?
In my opinion, definitely yes.
It provides a starting point for building national and sectoral inventories.
It can provide a stimulus for governments to take ownership of this process.
It shows the growing potential of harvesting data from the internet.
The biggest challenge in employing LLMs is learning how to pose accurate and unambiguous questions. Perplexity appears to be very responsive in correcting errors once they are pointed out.
It very clearly demonstrates that there is more to digital transformation than identity, payments and data exchange.
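One of the known flaws noted above, cross-cutting systems duplicated across sectors, lends itself to a simple first-pass screen. The Python sketch below is illustrative only: the field names (`country`, `sector`, `system_name`) and the sample records are assumptions made for the example, not the actual schema or contents of the database, and the matching is deliberately naive (case and whitespace only).

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def cross_sector_duplicates(records):
    """Map (country, normalised system name) to the sorted list of sectors
    it appears under, keeping only systems listed in more than one sector."""
    sectors_by_system = defaultdict(set)
    for rec in records:
        key = (rec["country"], normalise(rec["system_name"]))
        sectors_by_system[key].add(rec["sector"])
    return {
        key: sorted(sectors)
        for key, sectors in sectors_by_system.items()
        if len(sectors) > 1
    }


# Invented sample records, for illustration only (hypothetical field names).
sample = [
    {"country": "Country A", "sector": "Finance", "system_name": "Example Payroll System"},
    {"country": "Country A", "sector": "Health", "system_name": "example  payroll system"},
    {"country": "Country A", "sector": "Health", "system_name": "Example Clinic Registry"},
]

for (country, name), sectors in cross_sector_duplicates(sample).items():
    print(f"{country}: '{name}' appears under {', '.join(sectors)}")
```

A screen like this only flags candidates for human review. Deciding whether a flagged entry is a genuine cross-cutting system or two distinct systems with similar names still needs someone who knows the country.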