Not many organisations and performers can afford live-streaming with multiple cameras, some mounted on cranes, capturing the feed from different angles to give the audience a great viewing experience.

It is here that Artificial Intelligence comes into play, helping those who cannot deploy multiple cameras to capture a performance.

Imagine this: a guitarist introduces the fellow performers assisting him in the show. If you are watching the stream from a single camera, which is generally placed quite far from the stage, you cannot see the emotion.

“If you capture the performance from a single camera, you can’t really capture the mood of the actors or performers on the stage well,” Vineet Gandhi, a Professor with the Centre for Visual Information Technology (CVIT) of the International Institute of Information Technology (IIIT-Hyderabad), told Business Line.

The feed captured by a single camera is best suited for archival purposes. It is unlikely to hold viewers’ attention, though, as it is difficult for them to fathom the feelings of the performers.

AI algorithms

A team of researchers led by Vineet Gandhi has developed AI algorithms that can identify the persons on stage and create virtual cameras to capture their moods from the feed of a single camera.

Called GAZED, or gaze-guided editing, the system uses feed captured by a solitary, static, wide-angle and high-resolution camera. It mimics a typical viewer’s gaze, which naturally moves as the subject moves.
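How can a virtual camera follow a performer the way a viewer’s gaze does? A minimal Python sketch of the idea (an illustration only, not the team’s code; the function name and smoothing factor are assumptions) is to smooth the subject’s per-frame position into a camera path, so the virtual pan tracks movement without jitter.

    # Ease a virtual camera toward the subject each frame, so the pan
    # follows the performer the way a gaze would, but without jitter.
    def smooth_pan(subject_x, alpha=0.1):
        path, cam = [], float(subject_x[0])
        for x in subject_x:
            cam += alpha * (x - cam)  # move a fraction of the way per frame
            path.append(cam)
        return path

    # Example: the camera centre trails a performer crossing the stage.
    print(smooth_pan([100, 100, 400, 400, 400], alpha=0.5))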

The team presented a paper titled ‘GAZED: Gaze-guided cinematic editing of wide-angle monocular video recordings’ at the Conference on Human Factors in Computing Systems 2020. KL Bhanu Moorthy, Moneish Kumar and Ramanathan Subramanian have co-authored the paper.

The AI engine in GAZED simulates multiple virtual cameras from a single video input. From the shots generated by these virtual cameras, the virtual editor picks the relevant ones and stitches them into a smooth flow.
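To make this concrete, here is a minimal Python sketch of the idea (an assumed illustration, not the published implementation): each virtual camera is simply a cropped window of the high-resolution wide-angle frame, centred on a performer and rescaled to the output resolution. The file name and performer positions are hypothetical; in practice the positions would come from a person detector.

    # Simulate a "virtual camera" as a 16:9 crop of one high-resolution frame.
    # Assumes a 16:9 source frame, e.g. 4K (3840 x 2160).
    import cv2

    OUT_W, OUT_H = 1280, 720  # resolution of each virtual-camera shot

    def virtual_camera_shot(frame, cx, cy, zoom):
        # zoom > 1 crops a smaller window, i.e. a tighter close-up.
        h, w = frame.shape[:2]
        crop_w = int(w / zoom)
        crop_h = int(crop_w * OUT_H / OUT_W)
        # Clamp the window so it stays inside the original frame.
        x0 = min(max(int(cx - crop_w / 2), 0), w - crop_w)
        y0 = min(max(int(cy - crop_h / 2), 0), h - crop_h)
        crop = frame[y0:y0 + crop_h, x0:x0 + crop_w]
        return cv2.resize(crop, (OUT_W, OUT_H), interpolation=cv2.INTER_AREA)

    frame = cv2.imread("stage_wide_angle.jpg")           # hypothetical 4K stage frame
    performers = [(900, 600), (1700, 650), (2600, 580)]  # assumed (x, y) centres from a detector
    closeups = [virtual_camera_shot(frame, x, y, zoom=3.0) for x, y in performers]
    master = virtual_camera_shot(frame, frame.shape[1] // 2, frame.shape[0] // 2, zoom=1.0)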

“There is no compromise on the quality of the shots. It depends on the quality of the feed captured by the physical camera,” he said.

Cinematic principles

The team has trained the algorithm on cinematic principles such as avoiding cuts between overlapping shots, avoiding rapid shot transitions and maintaining a rhythm.
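One common way to encode such editing rules is as penalty terms in a dynamic program that chooses, frame by frame, which virtual shot to show. The sketch below is an assumed illustration of that idea, not the paper’s implementation: every cut costs a little, which discourages rapid transitions, cuts between overlapping shots cost a lot, and each shot earns a reward for matching the viewer’s gaze.

    # An assumed illustration (not the paper's algorithm) of encoding editing
    # rules as costs in a dynamic program over which virtual shot to show.
    import numpy as np

    def select_shots(gaze_score, overlap, cut_penalty=2.0, overlap_penalty=5.0):
        # gaze_score[t, s]: how well shot s matches viewer gaze at frame t.
        # overlap[p, s]: 1 if shots p and s frame overlapping stage regions.
        T, S = gaze_score.shape
        cost = np.full((T, S), np.inf)
        back = np.zeros((T, S), dtype=int)
        cost[0] = -gaze_score[0]
        for t in range(1, T):
            for s in range(S):
                for p in range(S):
                    c = cost[t - 1, p] - gaze_score[t, s]
                    if p != s:                                # a cut happens here
                        c += cut_penalty                      # discourages rapid transitions
                        c += overlap_penalty * overlap[p, s]  # jump cuts between overlapping shots
                    if c < cost[t, s]:
                        cost[t, s], back[t, s] = c, p
        # Trace the cheapest sequence of shots back from the last frame.
        seq = [int(np.argmin(cost[-1]))]
        for t in range(T - 1, 0, -1):
            seq.append(int(back[t, seq[-1]]))
        return seq[::-1]

    # Example: 200 frames, 3 virtual shots, no overlapping pairs.
    timeline = select_shots(np.random.rand(200, 3), np.zeros((3, 3)))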

“With cameras getting more and more sophisticated, the quality of the feed captured from a camera would further increase. This would further improve the output from the GAZED system,” he said.

The beauty of the system is that it doesn’t require much time to edit the feed generated by the virtual cameras. “A two-minute feed from a single camera can be edited in two minutes,” he said.

 
