How India can become an AI powerhouse

 As Artificial Intelligence becomes more mainstream, data will be a key asset. We need to create India-centric datasets that take India’s needs and rich diversity into cognizance.

Artificial Intelligence, or AI, is the ability of devices to display human-like intelligence using data. AI has become as ubiquitous as electricity: invisible, yet enhancing multiple aspects of our lives, from the voice assistants in our mobile phones and the recommendation engines on e-commerce and video streaming websites to the navigation devices that help us find the best route. As the developed western economies and China make strides in Artificial Intelligence and data technologies, we look at measures that would enable India to play a pivotal role in AI.

Data creation: Focus on India-centric data

As Artificial Intelligence becomes more mainstream, data will be a key asset. We need to create India-centric datasets that take India’s needs and rich diversity into cognizance.

As per the 2011 census, fewer than 11 percent of Indians speak English. India should look at AI solutions that cater to Indian regional languages. While cheap data and mobile phones have made the internet accessible to almost every Indian, local-language content is almost negligible, leaving vast portions of the internet inaccessible to most Indians. The Government’s National Translation Mission, which aims to make knowledge texts accessible to students and academics in Indian languages, is a laudable effort. We must also encourage initiatives such as AI4Bharat, led by IIT Madras, which focuses on improving the digital experience of millions of users who access the web in local Indian languages. In parallel, we should step up efforts to improve audio analytics in regional Indian languages. This will help people who cannot read, those with visual disabilities, and senior citizens.

Similarly, the ImageNet project, with over 14 million images, is among the most popular image databases. Only 2.1 percent of the images in ImageNet are from India. Many computer vision programs are initially trained on the ImageNet dataset. Given its low representation of Indian images, many image classification models are not well trained on Indian details and produce poor results on them. In 2018, the Delhi High Court was informed that the facial recognition system being trialled by the police to detect missing persons had an accuracy rate of only two percent.

The furore over the misclassification of Black women in the US led to concerted efforts to fix the issue. As a result, one AI facial recognition tool improved its accuracy in identifying Black women from 65% to 96%. However, the same tool is only 85% accurate in recognizing Indian women: more than one in seven Indian women is misclassified.

More India-specific datasets and a focus on India-centric solutions will help AI tools improve and deliver better results for India-specific use cases.

Data distribution: Frameworks for equitable data distribution

As we develop and store more India centric datasets, we need to focus on unifying the various data sources. A focused approach to ensure that the various data sources talk to each other is essential.

We should also develop robust frameworks for equitable data distribution and sharing. The Open Government Data (OGD) Platform India is a step in the right direction. Companies like Uber aggregate data about the journeys their drivers make through cities and, after removing personally identifiable data, share it with local governments in the US to help cities optimize traffic patterns. Similarly, after personally identifiable information has been removed, aggregated data should be shared with academics and policymakers for improved research and decision-making.
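The idea of removing personal identifiers and sharing only aggregates can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Uber or any government platform actually processes data: the trip records, the field names (`rider_name`, `phone`, `zone`, `hour`, `minutes`), and the minimum group size are all hypothetical.

```python
from collections import defaultdict

# Hypothetical trip records; every field name and value is invented for illustration.
trips = [
    {"rider_name": "Rider A", "phone": "0000000001", "zone": "Central", "hour": 9, "minutes": 30},
    {"rider_name": "Rider B", "phone": "0000000002", "zone": "Central", "hour": 9, "minutes": 40},
    {"rider_name": "Rider C", "phone": "0000000003", "zone": "North", "hour": 22, "minutes": 15},
]

# Assumed list of fields that identify a person directly.
PII_FIELDS = {"rider_name", "phone"}

# Assumed threshold: groups smaller than this are withheld, because a
# "group" of one trip could still point to an individual.
MIN_GROUP_SIZE = 2


def strip_pii(record):
    """Return a copy of the record without the direct identifiers."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def aggregate(records, min_group_size=MIN_GROUP_SIZE):
    """Group de-identified trips by (zone, hour) and keep only summary statistics."""
    groups = defaultdict(list)
    for record in records:
        clean = strip_pii(record)
        groups[(clean["zone"], clean["hour"])].append(clean["minutes"])
    return {
        key: {"trips": len(durations), "avg_minutes": sum(durations) / len(durations)}
        for key, durations in groups.items()
        if len(durations) >= min_group_size
    }


summary = aggregate(trips)
print(summary)
```

Dropping names and phone numbers is only the first step; the small-group threshold is included because aggregates over very few trips can still reveal an individual's movements.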

Ownership: Promote Indian Ownership

India’s paucity of risk capital results in many startups and technology companies giving up ownership to foreign companies. Data analytics unicorn Mu Sigma is headquartered in the US. So are unicorns like Moglix, Postman, Icertis, Chargebee, Gupshup, Innovaccer, Druva, HighRadius, and Zenoti, which are headquartered abroad. Unicorns such as Flipkart, Paytm, Ola, and Oyo have significant and, in many cases, majority foreign ownership.

Some tech companies in the US use dual-class shares with differential voting rights. For instance, Facebook’s (now Meta’s) shares and voting rights are structured so that founder Mark Zuckerberg retains control despite a public listing and multiple rounds of funding. Dual-class share structures that let Indian promoters retain control should also be encouraged.
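A small worked example shows how differential voting rights separate control from economic ownership. The figures below are purely hypothetical and are not Meta’s actual share structure: the sketch assumes a founder holding 100 Class B shares with 10 votes each and outside investors holding 900 Class A shares with 1 vote each.

```python
# Hypothetical cap table; the share counts and votes per share are invented for illustration.
VOTES_PER_SHARE = {"A": 1, "B": 10}

holdings = {
    "founder": {"A": 0, "B": 100},
    "outside_investors": {"A": 900, "B": 0},
}


def total_shares(holder):
    """Number of shares held across both classes (economic ownership)."""
    return sum(holder.values())


def total_votes(holder):
    """Number of votes, weighting each share class by its votes per share."""
    return sum(count * VOTES_PER_SHARE[cls] for cls, count in holder.items())


all_shares = sum(total_shares(h) for h in holdings.values())
all_votes = sum(total_votes(h) for h in holdings.values())

founder_economic_share = total_shares(holdings["founder"]) / all_shares
founder_voting_share = total_votes(holdings["founder"]) / all_votes

print(f"Founder economic ownership: {founder_economic_share:.1%}")
print(f"Founder voting control:     {founder_voting_share:.1%}")
```

Under these assumed numbers the founder owns a tenth of the company’s shares yet holds a majority of the votes, which is the mechanism that lets promoters raise capital without surrendering control.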

Policy should focus on measures that help emerging AI companies stay Indian and not lose control to foreign companies. Such measures may include further simplifying the compliance and regulatory environment for startups. We may also explore giving preference to India-domiciled companies in sensitive sectors, including defense, policing, and health care.

In summary, we are confident that focusing on India-centric data, better data sharing, and an enabling framework to promote startups remaining domiciled in India will help the growth of AI in India.

(The author is Dean, Wadhwani Institute of Technology and Policy at Wadhwani Foundation.)
