Authors: Ankush Agrawal, IIT Delhi, and Vikas Kumar, Azim Premji University
The growing clamour in India for evidence-based and targeted policymaking has not been matched by improvements in the quality of data. Whatever attention quality receives is restricted to data accuracy. By contrast, even though the National Statistical Commission flagged the timely release of data as a major issue more than a decade and a half ago, timeliness as a dimension of data quality has still not received much attention from the government, academia or the media.
In recent times, India has seen growing delays in the release of major databases despite the ‘technocratisation’ of policymaking, public professions of faith in evidence-based policymaking, the introduction of advanced data processing technologies and a growing fascination with big and real-time data. Some of these developments were reflected in the Bharatiya Janata Party’s (BJP) campaign and manifesto for the 2014 parliamentary elections, both of which engaged with government statistics. The party’s campaign addressed the delays in releasing 2011 census data on religion, and its manifesto promised to use ‘technology to disseminate real-time data’ and to set up ‘an institute of big data and analytics for studying the impact of big data across sectors for predictive science’.
Since coming to power, however, the BJP has not followed through. It has not yet released the full set of 2011 census data: so far only one of the migration tables has been released, and the complete general population table reports have not been published, which deprives policymakers of valuable qualitative information. Census tables on language also await publication. The BJP government has also failed to release other reports, including the Report of the High-level Committee on Socio-Economic, Health and Educational Status of the Tribals of India, the complete findings of the Rapid Survey of Children and the Study on Unaccounted Incomes in India.
Despite dramatic improvements in data processing technologies that should have drastically reduced the time involved, the time between the collection and release of religion and language data has increased since 1970.
The government released language tables from the 1961 census in 1964. That census published information about all languages spoken in the country, rather than just languages with more than 10,000 speakers, so the data released were more, rather than less, complex. By contrast, the government released the language tables of later censuses five to six years after enumeration. It has still not released the 2011 language tables after seven years. Further delays would mean that these data will not be available until after the preparations for the 2021 census begin in 2019. Similar observations hold for religion data.
In the case of religion, it is possible to identify the date by which the census data might have been processed, because the census questionnaire links the identification of a person’s caste to his or her religious affiliation. This means that the caste and religion data have to be sorted together and can be released around the same time.
Indeed, until 1981, the data on scheduled castes, scheduled tribes and religion were released in the same year or within a year of each other. This near-concurrent release has not continued. Caste and tribe data from the 2011 census were available on 30 April 2013, when the primary census abstract was released. Many hoped that other tables would be released sooner than expected because the abstract was released ‘a year ahead of schedule’. Further, as per the National Statistical Commission’s recommendations, the religion and language tables of the 2011 census should have been released by March 2014.
In 2004, the BJP-led National Democratic Alliance government failed to release the 2001 religion data. The Indian National Congress-led United Progressive Alliance government, which took credit for institutionalising the right to information, failed to release the 2011 religion data in 2014. It is noteworthy that both 2004 and 2014 were election years. Some of the key tables based on the 2011 census data on religion were finally released in August 2015, 28 months after the primary census abstract’s release and 15 months after the 2014 elections.
The delayed release of data reflects the unwillingness of India’s deteriorating statistical system to face public scrutiny. The absence of cross-examination affects the quality of statistics, which in turn pushes the system into a vicious cycle of deteriorating data quality and diminishing trust in the system. Moreover, delayed releases make data obsolete for policymakers.
The growing delays in the release of government statistics are also an indicator of the government’s interference with India’s statistical machinery. If one focusses specifically on religion and language data from the census, the growing delays can be read as symptoms of the deepening communal crisis. The delays have been growing since 1981, a period that witnessed the communalisation of Indian politics. Under these circumstances, the re-insulation of the statistical system from governmental and political interference should be an urgent priority.
Ankush Agrawal teaches economics at IIT Delhi.
Vikas Kumar teaches economics at Azim Premji University, Bengaluru.