Jan 07, 2015

Robotic camera mimics human operators to anticipate basketball game action

(Nanowerk News) Automated cameras make it possible to broadcast even minor events, but the result often looks...well, robotic. Now scientists at Disney Research have made it possible for robotic cameras to learn from human operators how to better frame shots of a basketball game.

Many automated systems determine where to point the camera by tracking a key object, such as a lecturer. But human camera operators are able to anticipate action and can adjust the camera's pan, tilt and zoom controls to allow more space, or "lead room," in the direction that the action is moving. The result is video imagery that is smooth and aesthetically pleasing.

The broadcast footage of a sports event is calibrated to learn the pan-tilt-zoom configurations used by a camera operator. This information is combined with tracked player positions to build a structured predictor. Given unseen player positions from a new game, the learned predictor is used to generate target pan-tilt-zoom values for a robotic camera. As a result, the automatically generated broadcast footage looks more human-like.

Peter Carr, a Disney Research engineer, and Jianhui Chen, an intern and a Ph.D. student in computer science at the University of British Columbia, devised a data-driven approach that allows a camera system to monitor an expert camera operator during a basketball game. The automated system uses machine learning algorithms to recognize the relationship between player locations and corresponding camera configurations.

They will report their findings at WACV 2015, the IEEE Winter Conference on Applications of Computer Vision, Jan. 6-9, in Waikoloa Beach, Hawaii.

"We don't use any direct information about the ball's location because tracking the ball with a single camera is difficult," Carr said. "But players are coached to be in the right place at the right time, so their formations usually give strong clues about the ball's location."

Carr and Chen demonstrated their system on a high school basketball game. They used two cameras: a broadcast camera operated by a human expert, and a stationary camera that the computer used to detect and track the players automatically.
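
To make the idea concrete, pairing tracked player positions with the operator's camera settings can be treated as a supervised regression problem. The Python sketch below is purely illustrative and is not the researchers' method: it swaps in ordinary least squares on a single feature (the mean player position along the court) in place of the structured predictor described by Disney Research, and it predicts only a pan angle. The court length, the synthetic data, the made-up "operator" rule and every function name are assumptions introduced for the example.

```python
import random
from statistics import fmean

COURT_LENGTH_M = 28.0  # assumed court length, used only to fabricate data


def formation_center(player_xy):
    """Mean position of the tracked players along the court's length."""
    return fmean(x for x, _ in player_xy)


def fit_pan_predictor(frames, operator_pan_deg):
    """Ordinary least squares: pan ~ slope * formation_center + intercept."""
    centers = [formation_center(f) for f in frames]
    mean_c, mean_p = fmean(centers), fmean(operator_pan_deg)
    cov = sum((c - mean_c) * (p - mean_p)
              for c, p in zip(centers, operator_pan_deg))
    var = sum((c - mean_c) ** 2 for c in centers)
    slope = cov / var
    return slope, mean_p - slope * mean_c


def predict_pan(model, player_xy):
    """Predict a pan angle (degrees) for one frame of player positions."""
    slope, intercept = model
    return slope * formation_center(player_xy) + intercept


def make_synthetic_game(n_frames, rng):
    """Fabricated frames plus a made-up 'operator' who pans toward the players."""
    frames, pans = [], []
    for _ in range(n_frames):
        mid = rng.uniform(4.0, 24.0)
        frame = [(min(max(rng.gauss(mid, 2.0), 0.0), COURT_LENGTH_M),
                  rng.uniform(0.0, 15.0)) for _ in range(10)]
        frames.append(frame)
        # Hypothetical operator rule: pan is proportional to the players'
        # offset from midcourt, plus a little hand jitter.
        pans.append(3.0 * (formation_center(frame) - COURT_LENGTH_M / 2)
                    + rng.gauss(0.0, 0.5))
    return frames, pans


rng = random.Random(0)
model = fit_pan_predictor(*make_synthetic_game(500, rng))  # "training" game
test_frames, test_pans = make_synthetic_game(100, rng)     # unseen game
errors = [abs(predict_pan(model, f) - p)
          for f, p in zip(test_frames, test_pans)]
mean_abs_error = fmean(errors)
print(f"mean absolute pan error: {mean_abs_error:.2f} degrees")
```

In this toy setup the stationary camera's player tracks play the role of `frames` and the human operator's recorded pan angles play the role of `operator_pan_deg`. A real system would need richer formation features and a predictor that also produces smooth camera motion over time.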

"Because the main broadcast camera in basketball maintains a wide shot of the court, we focused on predicting the appropriate pan angle of the camera," Carr said. Following supervised learning based on the operator's actions, the system was able to predict how to pan the camera in a way that was superior to the best previous algorithm and that did indeed mimic a human operator.

Carr said he expects the method can be adapted to other sports, possibly with additional features. Future work also will include mimicking the auxiliary cameras used for cutaway shots in multi-camera productions.

Source: Disney Research