In recent times the unavailability of reliable and timely data has tied the hands of policymakers and exposed critical gaps in official information systems. Most recently, official statistics on Covid-19 have been criticised for both delays and under-reporting of incidence and deaths amidst the government’s inept response, blamed, among other things, on the poor quality of its migration and economic statistics.

Questions on the quality of Covid-19 statistics are being raised in isolation from the larger problem of data deficit in the country that encompasses several other government sources of information such as the Census, National Sample Surveys (NSS), and National Accounts.

There is a need to understand the government’s ability to generate reliable information by locating its statistical systems in their wider political and economic contexts. The experience of India’s landlocked periphery stretching from Jammu and Kashmir to Mizoram aptly illustrates the context dependence of data quality.

Most peripheral States are constrained by locational disadvantages, including landlockedness, difficult terrain, dense forest cover, and lack of physical connectivity. This constrains the government’s developmental outreach. It also hinders people’s access to educational and healthcare facilities as well as markets.

Due to these disadvantages, the peripheral States were designated as special category States eligible for preferential federal support. Despite this preferential treatment, the periphery remains a site of sustained underdevelopment.

New Delhi’s top-down approach to development, obsessed with investments in physical infrastructure, has fuelled corruption in the form of heavy administrative overheads and stymied local entrepreneurship that is already hamstrung by insurgencies and remoteness.

As a result, the tax base of peripheral States remains small, making them dependent on preferential federal transfers that support the bloated public sector. Lack of opportunities outside the public sector forces the State governments to continuously expand public employment, which further deepens their dependence on federal transfers.

However, despite massive overstaffing, the public sector has failed to keep up with the burgeoning ranks of unemployed college graduates. Unresolved insurgencies in these States have led to a democracy deficit.

Several peripheral States have not yet seen proper cadastral and cartographic surveys. Political disturbances have occasionally led to the cancellation of censuses in the periphery.

The NSS, the most important source of household-level data on India, is unable to generate representative estimates of even basic socio-economic indicators for States/Union Territories like Jammu and Kashmir and Nagaland.

Data manipulation

In the past, these States, which account for barely five per cent of the country’s population, have accounted for more than half of all the sampling units in the country that could not be covered by the NSS. A lesser known aspect of the problem of data deficit is the manipulation of government statistics by people who are deeply dependent on public resources. A lack of trust in public institutions that distribute resources on the basis of reported numbers and measured disadvantages has triggered what can be called ethno-statistical entrepreneurship across India’s periphery.

In recent years, Nagaland, Kashmir, and parts of Manipur have witnessed widespread manipulation of censuses aimed at securing greater representation in the legislature and a larger share in the public pie. Underdevelopment constrains the government’s ability to invest in statistical systems.

The resultant poor-quality data impede policy-making, which contributes to underdevelopment, which in turn feeds political unrest that weakens (formal) democratic institutions. Political unrest affects data collection by physically obstructing surveys and undermines the provision of public goods, including government statistics.

The data deficit, in turn, corrodes trust in public institutions that allocate power and resources using statistics, and aggravates political unrest. The periphery, therefore, finds itself trapped in a web of data, development, and democracy deficits.

One finds similar problems plaguing government statistics in the internal periphery, which includes slums and tribal pockets in central India that together account for a sizeable area and population of the country.

The growing clamour in developing countries like India for evidence-based policymaking overlooks the poor quality of government statistics. Attempts to mechanically fix data quality through the introduction of better data collection and processing tools alone will not help as the data deficit is embedded in a mutually constitutive relationship with democracy and development deficits.

Digital deficiency

Not coincidentally, the digital databases of the Indian government are deficient in the very areas where its traditional databases are incomplete or flawed. So, statistical reforms that resort to technological, legal, and administrative solutions without strengthening democratic institutions and attending to the development deficit are unlikely to succeed.

In the medium to long run, reforms are needed that strengthen the Right to Information and rebuild the autonomy of the judiciary, media, and the Comptroller and Auditor General, all of which are essential for more effective oversight of the government's statistical machinery.

At another level, three things are needed: peaceful resolution of insurgencies; administrative reforms that improve the functioning of government bodies, including statistical organisations; and electoral reforms and fiscal and political decentralisation that restore trust in government bodies as neutral arbiters.

Multipronged and incremental statistical reforms supported by statistical audits are necessary because a major overhaul of the existing statistical bodies is difficult due to a resource crunch.

First, official statisticians should identify their constitutional, statistical, and policy obligations. They should restrict their activities to fulfilling these obligations rather than collecting data for a theoretically desirable or academically useful set of variables.

Second, official statisticians should be nudged to restore the earlier practice of releasing detailed methodological and descriptive reports that help understand the larger context of data.

Third, official statisticians should publish tabulation plans and a calendar for the release of data in advance to minimise the growing political interference that delays the release of inconvenient statistics. Finally, there is a need for greater engagement between official statisticians and various stakeholders.

Inclusive public outreach will not only address the problem of non-response or obstruction of surveys on the ground but will also improve acceptability of published data.

Agrawal and Kumar teach economics at the Indian Institute of Technology Delhi and Azim Premji University, Bengaluru, respectively. This article is by special arrangement with the Centre for the Advanced Study of India, University of Pennsylvania.
