Do you often misplace your keys or spectacles? Or absent-mindedly add salt to a dish twice, spoiling it? Imagine if you could get a message with a picture of where you left your keys or glasses, or a beep when you are about to add salt a second time or make some other embarrassing mistake.

In a virtual reality-powered world, or the Metaverse, where all of us could be wearing smart glasses, this may be possible. The glasses could even teach us how to play the flute or to cook. Why, they may even guide surgeons during complex procedures.

AI-backed system

This is not sci-fi. A global consortium of universities and technical colleges has teamed up for Ego4D, an ambitious project backed by social media major Facebook, to add another dimension to Artificial Intelligence-backed advisory systems.

“That AI is helping us in myriad ways is not something new. By analysing vast volumes of video, audio and text data, AI tools are giving us insights. But all the while, these systems have relied on static cameras mounted on poles along roads or in the corners of buildings,” explains CV Jawahar, Professor at the Centre for Visual Information Technology at the International Institute of Information Technology (IIIT-Hyderabad).

Right now, he says, we are only getting a third-party view of what is happening. For cameras to help you in what you do, they need a first-person view.

“This means cameras should be at the centre of the action. They should look at things the way you do. You should make AI understand and perceive the world the way you do,” he says.

IIIT-H is the only institute from India to join this global initiative, in which 13 universities are pooling massive datasets capturing the first-person view of a host of activities – from cooking to carpentry, and from agriculture to mechanical work.

Apart from India, datasets are being collected in the US, the UK, Italy, Singapore, Rwanda and Colombia to add scale and diversity to the research. The consortium has already gathered 3,000 hours of data from across the world, which will soon be released to researchers, start-ups and organisations to build products and solutions that can give people intelligent, actionable insights into what they do.

“The pile-up of datasets will continue to grow. We are planning to contribute 1,000 hours of data by the end of December 2022. IIIT-H has deployed about 20 matchbox-sized video cameras, worn on the forehead, to collect data from different parts of the country, capturing several activities,” says Jawahar.

As he points out, the data have to be diverse because the usability of a solution depends on social, cultural and geographical factors; you need relevant data to produce relevant solutions for a particular geography.

“We are hurtling into the Metaverse faster than we could have imagined.”
