NITI Aayog recently released a report titled “India’s Data Imperative: The Pivot Towards Quality.”
- The report underscores the urgent need for robust data quality to fortify digital governance, cultivate public trust, and ensure efficient service delivery.
- The report critically examines the pervasive challenges posed by poor data quality and introduces practical, easy-to-use tools, such as a data quality scorecard and a maturity framework, to address them.
About Data
- Data refers to raw facts and figures that are collected, measured, or observed.
- It can be quantitative (numerical values) or qualitative (descriptive characteristics).
- Data, in its raw form, does not have meaning or context until it is processed, analyzed, and interpreted.
Key Data Points

- Unified Payments Interface (UPI)
- Transactions Processed: 17.89 billion transactions in April 2025.
- Total Value: ₹23.9 trillion, rivalling the monthly GDP of several mid-sized economies.
- Aadhaar Authentications: 27.07 billion in FY 2024-25.
- Ayushman Cards in Circulation: Over 369 million cards.
- DigiLocker: 46.52 crore users as on 1 February 2025.
- Internet Connections: 96.96 crore connections in June 2024.
- Broadband Connections: 94.92 crore connections in August 2024.
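The figures above mix Indian units (crore) with international ones (million, billion, trillion). The short sketch below puts two of them on a common scale and derives the average value of a UPI transaction from the quoted totals. The helper and variable names are illustrative assumptions, not taken from the report.

```python
# Illustrative helpers (names are hypothetical) for comparing the figures above.
CRORE = 10_000_000                 # 1 crore = 10 million
BILLION = 1_000_000_000
TRILLION = 1_000_000_000_000


def crore_to_million(value_in_crore: float) -> float:
    """Convert a figure expressed in crore to millions."""
    return value_in_crore * CRORE / 1_000_000


# Figures quoted in the list above.
upi_transactions = 17.89 * BILLION   # April 2025
upi_value_rupees = 23.9 * TRILLION   # April 2025
digilocker_users_crore = 46.52       # as on 1 February 2025

# Average value per UPI transaction, in rupees.
avg_ticket = upi_value_rupees / upi_transactions

print(f"Average UPI transaction: about Rs {avg_ticket:,.0f}")
print(f"DigiLocker users: about {crore_to_million(digilocker_users_crore):.1f} million")
```

On these numbers, the average UPI transaction works out to roughly ₹1,336, and 46.52 crore DigiLocker users is about 465 million.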
Data Quality: Shift from Scale to Precision
- Data quality refers to the accuracy, consistency, completeness, timeliness, validity, and reliability of data.
- High-quality data accurately represents the real-world entities or events it is supposed to describe, meets the specific needs of users, and can be used effectively to support decision-making, policy formulation, and service delivery.
- India’s digital public infrastructure (e.g., UPI, Aadhaar, Ayushman Bharat) has achieved unprecedented scale, but the next decade demands a shift to data quality to sustain trust and efficiency.
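The dimensions listed above can be turned into simple, measurable ratios. The sketch below is a minimal illustration, assuming a hypothetical list of beneficiary records with made-up field names. It scores four of the dimensions (completeness, validity, uniqueness, and timeliness) as the share of records that pass each check. The length check on the IFSC field is only a stand-in for a real validity rule.

```python
from datetime import date

# Hypothetical beneficiary records; the field names are assumptions for illustration.
records = [
    {"id": "B001", "name": "Asha", "ifsc": "ABCD0123456", "updated": date(2025, 3, 1)},
    {"id": "B002", "name": "", "ifsc": "ABCD0123456", "updated": date(2023, 2, 10)},
    {"id": "B003", "name": "Ravi", "ifsc": "1234", "updated": date(2021, 6, 5)},
    {"id": "B001", "name": "Asha", "ifsc": "ABCD0123456", "updated": date(2025, 3, 1)},
]

REQUIRED_FIELDS = ("id", "name", "ifsc", "updated")


def quality_scores(records, as_of, max_age_days=365):
    """Return the share of records passing each data quality check (0.0 to 1.0)."""
    total = len(records)
    # Completeness: every required field is present and non-empty.
    complete = sum(
        all(r.get(f) not in ("", None) for f in REQUIRED_FIELDS) for r in records
    )
    # Validity (stand-in rule): the IFSC field has the expected length of 11.
    valid = sum(len(r["ifsc"]) == 11 for r in records)
    # Uniqueness: distinct IDs as a share of all records.
    unique_ids = len({r["id"] for r in records})
    # Timeliness: the record was updated within the allowed window.
    timely = sum((as_of - r["updated"]).days <= max_age_days for r in records)
    return {
        "completeness": complete / total,
        "validity": valid / total,
        "uniqueness": unique_ids / total,
        "timeliness": timely / total,
    }


scores = quality_scores(records, as_of=date(2025, 4, 1))
for dimension, score in scores.items():
    print(f"{dimension}: {score:.0%}")
```

Accuracy and reliability are left out because they need an external source of truth to compare against, which a self-contained check like this cannot provide.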
Why Data Quality Matters:
- Fiscal Efficiency and Resource Allocation
- Fiscal Leakage: Poor data, such as erroneous or duplicate beneficiary records, leads to unnecessary fiscal leakage.
- For instance, inaccurate or duplicate records inflate annual welfare expenditure by an estimated 4-7%.
- Wasted Resources: Inaccurate data results in inefficient allocation of resources.
- Example: Removing 17.1 million ineligible names from the PM-Kisan list saved approximately ₹90 billion.
- Policy Effectiveness and Precision: Inconsistent data distorts policy decisions and delays necessary adjustments.
- Example: Mismatched data on land titles delayed crop-loss compensation in one district.
- Public Trust and Governance: Mismatched records or rejected claims erode citizens’ trust in digital platforms.
- Citizens lose confidence, which undermines welfare initiatives.
- Example: Incorrect Aadhaar details have blocked pensions and health benefits for many citizens.
- Efficiency in Service Delivery: Poor data slows down service delivery.
- Example: A wrong IFSC code in PM-Kisan records delayed subsidy transfers.
- AI and Data-Driven Governance:
- AI Models Depend on Data Quality: If the data used for AI models is inaccurate or incomplete, it can result in incorrect predictions or misguided decisions that undermine public welfare.
- AI Hallucinations: Bad data in AI systems can lead to hallucinations—when the AI generates wrong or misleading outputs.
- The Cost of Poor Data Quality: Incorrect beneficiary information can stall welfare payments, and duplicate records can inflate government expenditures.
- Fixing these errors often requires expensive, manual reconciliation processes, which further waste public resources.
- Example: Fixing bulk LPG rejections caused by data errors took two years.
- Data Quality Debt: Over time, allowing poor data quality to persist builds a data debt—an accumulation of unaddressed errors and inefficiencies.
- This leads to a compounding problem, where issues get harder to fix as more data is added to the system.
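Duplicate beneficiary records, one of the leakage sources described above, can often be surfaced with a simple grouping pass. The sketch below is a minimal illustration on hypothetical records. It normalises names (case and spacing) and groups records that share the same name, date of birth, and account number. The field names and the matching rule are assumptions. Real de-duplication would typically need fuzzy matching and manual review before any record is removed.

```python
from collections import defaultdict

# Hypothetical beneficiary records; the field names are assumptions for illustration.
records = [
    {"id": "B001", "name": "Asha Devi", "dob": "1980-01-05", "account": "001234"},
    {"id": "B002", "name": "asha  devi ", "dob": "1980-01-05", "account": "001234"},
    {"id": "B003", "name": "Ravi Kumar", "dob": "1975-09-12", "account": "009876"},
]


def normalise(text: str) -> str:
    """Lower-case the text and collapse repeated or trailing whitespace."""
    return " ".join(text.lower().split())


def find_duplicates(records):
    """Group record IDs that share the same normalised name, DOB, and account."""
    groups = defaultdict(list)
    for r in records:
        key = (normalise(r["name"]), r["dob"], r["account"])
        groups[key].append(r["id"])
    # Only groups with more than one record are candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


print(find_duplicates(records))
```

Here B001 and B002 are flagged as one candidate group, because they differ only in letter case and spacing.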
Challenges in India’s Data Ecosystem
- Systemic Design Flaws: Data platforms often prioritize quantity over accuracy, leading to poor data quality. Mistakes recur due to incentives focused on speed rather than precision.
- Example: During an Aadhaar linkage drive in 2017, 4.4 lakh ghost students were found claiming midday meal funds.
- Data Fragmentation: Data is stored in silos, with incompatible formats across ministries and departments. Data sharing and integration become manual and time-consuming.
- Legacy Systems, Modern Pressures: Many core systems are based on outdated technology with no validation or audit trails. Minor updates can disrupt the system, leading to data errors and service delays.
- Example: In 2024, ration card readers failed to recognize elderly fingerprints, delaying subsidized grain distribution.
- Lack of Accountability: No clear data ownership across systems leads to unaddressed errors. Errors persist, and there is no accountability for fixing them.
- Example: In late 2022, 17,000 health insurance cards were blocked due to identity issues, with no clear owner to fix the problem.
- Speed Over Accuracy: Rushed data entry is incentivized, leading to poor-quality data. Faster enrollment targets sacrifice accuracy, leading to costly errors.
- Example: During a 2013 LPG subsidy rollout, only 60% of households were correctly linked, and 40% were bulk-rejected to meet speed targets.
- Low Expectations: 80% accuracy is often considered good enough, allowing errors to persist. This low-bar mindset leads to chronic data errors that grow over time.
- Example: In 2019, a state declared itself open defecation-free, but an audit in 2020 found that nearly half of rural homes still lacked toilets.
Data Governance in India
- India’s journey in data collection and management dates back to 1881 with the first synchronous Census.
- Over the years, institutions like the National Sample Survey Organization (NSSO) and Central Statistical Organization (CSO) have played a vital role in data-driven governance.
- With the rise of digital technologies, Management Information Systems (MIS) have become more prevalent, helping government schemes and programs track inputs, outputs, and outcomes.
- For example:
- HMIS (Health Management Information System) for the National Health Mission tracks healthcare indicators at various levels (village, district, state).
- The Digital India program (since 2015) aims to make government services accessible online, supported by initiatives like the Pratibimba dashboard in Karnataka.
Data Governance Quality Index (DGQI)
- The Data Governance Quality Index (DGQI) was launched in 2020 by the Development Monitoring & Evaluation Office (DMEO), NITI Aayog.
- It is a comprehensive tool to assess the data preparedness of government ministries and departments.
- The first phase, DGQI 1.0, focused on data systems, while DGQI 2.0, launched in 2021, expanded to include data strategy and data-driven outcomes.
Centre for Data Management and Analytics (CDMA)
- It was established in June 2016 under the Comptroller and Auditor General of India (CAG), based on the recommendations of the Task force on Implementation of Big Data Management Policy.
- CDMA is the nodal centre for data analytics in the Indian Audit and Accounts Department. It provides guidance to field offices on data analytics and pioneers research and development on the future direction of data analytics.
Government Measures for Improving Data Quality
- National Data Governance Framework Policy (NDGP 2022): Aims to address the shortcomings in the current data governance ecosystem and enhance data quality across government departments.
- An India Data Management Office (IDMO) will be established to oversee data standards and sharing.
- Implementation: The policy has not been fully rolled out yet but is expected to play a key role in future data governance.
- Open Data Initiative (data.gov.in): To promote data transparency, accessibility, and standardization.
- data.gov.in provides a public platform for sharing high-value government data in machine-readable formats.
- Encourages public and private sector collaborations to use the data for research, innovation, and policy-making.
- National Data Analytics Platform (NDAP 2022): The NDAP was launched by NITI Aayog to improve the discoverability and accessibility of government datasets.
- Provides a centralized platform for accessing standardized government data from various ministries and departments.
- Ensures real-time data updates and supports data-driven research and analysis.
- Open Data Telangana (2016): To improve data transparency and public participation in governance.
- It has been successfully running since 2017 and supports data transparency and collaboration between government departments and citizens.
- Chief Data Officers (CDOs) in Ministries: To institutionalize data governance and improve data quality at the departmental level.
- Appointment of CDOs in ministries to oversee data quality and data stewardship.
- CDOs are responsible for ensuring data standards, validation, and compliance across their respective departments.
Global Examples
- Singapore – Data Governance and Transparency
- The Government Data Office oversees data sharing across ministries.
- Ensures transparent, standardized data, boosting public trust and efficiency in services.
- New Zealand – Integrated Data Infrastructure
- IDI merges data from various sectors (health, education, etc.).
- Provides comprehensive insights for policy-making and targeted interventions.
- Australia – Data Strategy and Chief Data Officers
- Appoints Chief Data Officers in departments to manage data quality.
- Enhances cross-department collaboration and data interoperability.
- Estonia – Digital Governance and e-Residency
- Digital government services and e-Residency for global access.
- Streamlines efficient service delivery with secure, quality data.
- United States – Open Data and Data Quality Assessment Framework
- Open Data Initiative and DQAF for assessing data quality.
- Promotes data transparency and reliability in public services.
- United Kingdom – National Data Strategy
- The Government Data Office implements the National Data Strategy.
- Ensures data standards, improving coordination and service delivery.
Way Forward for Improving Data Quality
- Institutionalizing Data Ownership: Designate data custodians at national, state, and district levels.
- Ensure clear accountability for maintaining data quality.
- Incentivizing Data Quality: Reward accuracy and completeness of data rather than speed.
- Align performance reviews with data quality metrics such as error rates and timeliness.
- Use data scorecards in welfare programs to monitor and improve data quality.
- Enhancing Interoperability: Develop common data formats and schemas for seamless data exchange between systems.
- Ensure smooth integration of data across departments and platforms.
- Aadhaar-linked schemes like DBT should operate with consistent data formats across systems.
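A common schema is easiest to picture with two departments that store the same person differently. The sketch below is a minimal illustration in which every department name, field name, and date format is hypothetical. Each source gets a field mapping and a date format, and both are converted into one shared record shape so that the results can be compared directly.

```python
from datetime import datetime

# Hypothetical per-department field names mapped to a shared schema.
FIELD_MAPS = {
    "dept_a": {"beneficiary_name": "name", "dob": "date_of_birth", "acct_no": "account"},
    "dept_b": {"Name": "name", "DateOfBirth": "date_of_birth", "BankAccount": "account"},
}

# Hypothetical per-department date formats.
DATE_FORMATS = {"dept_a": "%d/%m/%Y", "dept_b": "%Y-%m-%d"}


def to_common(record: dict, source: str) -> dict:
    """Convert one department's record into the shared schema."""
    mapping = FIELD_MAPS[source]
    out = {common: record[local] for local, common in mapping.items()}
    # Standardise spacing and capitalisation of names.
    out["name"] = " ".join(out["name"].split()).title()
    # Standardise dates to ISO 8601 (YYYY-MM-DD).
    parsed = datetime.strptime(out["date_of_birth"], DATE_FORMATS[source])
    out["date_of_birth"] = parsed.date().isoformat()
    return out


a = to_common(
    {"beneficiary_name": "asha devi", "dob": "05/01/1980", "acct_no": "001234"}, "dept_a"
)
b = to_common(
    {"Name": "Asha  Devi", "DateOfBirth": "1980-01-05", "BankAccount": "001234"}, "dept_b"
)
print(a == b)
```

Once both records are in the shared shape, a plain equality check is enough to confirm that they describe the same beneficiary, which is not possible while each system keeps its own format.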
Tools used to Improve Data Quality: NITI Aayog
- Data Quality Scorecard: Tracks key data quality attributes like accuracy, completeness, and timeliness across government departments.
- Usage: Helps identify gaps and take corrective actions.
- Data Quality Maturity Framework: Assesses data practices and maturity levels (Foundational to Institutionalized).
- Usage: Helps track progress and create improvement plans.
- Starter Kit for Quick Wins: Provides practical actions for quick data quality improvements.
- Features: Real-time validation, assigning data stewards, and linking grievance redressals to data corrections.
- Data Custodianship and Ownership Tools: Assigns data stewards to ensure data integrity.
- Usage: Oversees continuous updates and corrections, especially in high-value datasets like Aadhaar.
- Data Interoperability Framework: Ensures smooth data exchange across systems while maintaining integrity.
- Usage: Facilitates integration between Aadhaar, UPI, and PM-Kisan systems.
- Automated Data Entry and Validation Tools: Reduces human errors during data entry with real-time validation.
- Usage: Used in PM-Kisan to prevent errors like transposed digits or incorrect beneficiary data.
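Two checks of the kind described above can be sketched in a few lines. The first tests the general IFSC layout: four letters, a zero, then six alphanumeric characters. The second uses the Verhoeff checksum, a standard scheme that detects all single-digit errors and all swaps of adjacent digits. Whether any particular scheme uses exactly these checks is an assumption here, and the function names are illustrative only.

```python
import re

# General IFSC layout: 4 letters, the digit 0, then 6 alphanumeric characters.
IFSC_PATTERN = re.compile(r"[A-Z]{4}0[A-Z0-9]{6}")

# Verhoeff multiplication table (dihedral group D5).
_D = (
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
    (1, 2, 3, 4, 0, 6, 7, 8, 9, 5),
    (2, 3, 4, 0, 1, 7, 8, 9, 5, 6),
    (3, 4, 0, 1, 2, 8, 9, 5, 6, 7),
    (4, 0, 1, 2, 3, 9, 5, 6, 7, 8),
    (5, 9, 8, 7, 6, 0, 4, 3, 2, 1),
    (6, 5, 9, 8, 7, 1, 0, 4, 3, 2),
    (7, 6, 5, 9, 8, 2, 1, 0, 4, 3),
    (8, 7, 6, 5, 9, 3, 2, 1, 0, 4),
    (9, 8, 7, 6, 5, 4, 3, 2, 1, 0),
)

# Verhoeff position-dependent permutation table.
_P = (
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
    (1, 5, 7, 6, 2, 8, 3, 0, 9, 4),
    (5, 8, 0, 3, 7, 9, 6, 1, 4, 2),
    (8, 9, 1, 6, 0, 4, 3, 5, 2, 7),
    (9, 4, 5, 3, 1, 2, 6, 8, 7, 0),
    (4, 2, 8, 6, 5, 7, 3, 9, 0, 1),
    (2, 7, 9, 3, 8, 0, 6, 4, 1, 5),
    (7, 0, 4, 6, 9, 1, 3, 2, 5, 8),
)


def ifsc_format_valid(code: str) -> bool:
    """Check only the layout of an IFSC code, not whether the branch exists."""
    return bool(IFSC_PATTERN.fullmatch(code))


def verhoeff_valid(number: str) -> bool:
    """Return True if the digit string (check digit last) passes the Verhoeff check."""
    if not (number.isascii() and number.isdigit()):
        return False
    c = 0
    # Process digits right to left, permuting each by its position.
    for i, ch in enumerate(reversed(number)):
        c = _D[c][_P[i % 8][int(ch)]]
    return c == 0


print(ifsc_format_valid("ABCD0123456"))  # well-formed layout
print(verhoeff_valid("2363"))            # 236 with check digit 3
print(verhoeff_valid("2633"))            # same digits with two adjacent ones swapped
```

Running such checks at the point of data capture means a mistyped entry is rejected before it is saved, rather than being discovered later during reconciliation.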
Way Forward for Improving Data Quality (continued)
- Automating Data Validation: Implement real-time validation at the point of data capture.
- Minimize data entry errors by preventing incorrect data from entering the system.
- Automate data entry checks for PM-Kisan to prevent transposed digits.
- Promoting Data Stewardship Culture: Build a culture of data stewardship across government levels. Foster shared responsibility for data quality, making it integral to public service delivery.
- Use data quality frameworks to assess and improve data management practices in ministries.
- Ensuring Data Security and Privacy: Develop data governance frameworks that prioritize security and privacy.
- Follow global best practices for data privacy like those in Singapore and Australia.
- Regular Audits and Quality Checks: Establish periodic audits and quality checks for high-value datasets.
- Regularly update and verify databases like Aadhaar to maintain data consistency.
Conclusion
India’s shift from scaling digital infrastructure to prioritizing data quality is critical for sustaining trust, enhancing governance, and driving AI-powered public services. By institutionalizing ownership, incentivizing accuracy, and fostering interoperability, India can build a robust data ecosystem that ensures precision, efficiency, and equitable service delivery.