Algorithm searches for human actions in videos

An algorithm has been developed to automatically recognise human gestures or activities in videos in order to describe what is taking place.

MIT postdoc Hamed Pirsiavash and his former thesis advisor Deva Ramanan, of the University of California at Irvine, have used natural language processing techniques to improve computers' ability to search for particular actions within videos, whether that is making tea, playing tennis or weightlifting.

The activity-recognising algorithm is faster than previous versions and can make good guesses about partially completed actions, meaning it can handle streaming video. The researchers applied natural language processing to computer vision in order to break any action down into its components, in the same way that sentences are broken down into different elements.

The researchers essentially came up with a type of grammar for human movement, dividing up one main action into a series of subactions. As a video plays, the algorithm constructs a set of hypotheses about which subactions are being depicted and where, and ranks them according to probability. As the video progresses, it can eliminate hypotheses that don't conform to the grammatical rules, which then dramatically reduces the number of possibilities.
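To make the idea concrete, here is a minimal sketch in Python of ranking and eliminating hypotheses against a grammar of subactions. It is not the researchers' implementation: the grammar, the subaction names and the per-frame scores are all invented for illustration, and it assumes each action is a fixed sequence of subactions in which a hypothesis may only stay in its current subaction or move on to the next one.

```python
import math

# Hypothetical grammar: each action is an ordered sequence of subactions.
# These names are illustrative and do not come from the researchers' system.
GRAMMAR = {
    "making_tea": ["boil_water", "pour_water", "steep"],
    "weightlifting": ["grip_bar", "lift", "lower"],
}


def initial_hypotheses(frame_scores):
    """Start one hypothesis per action whose first subaction fits the first frame."""
    start = {}
    for action, sequence in GRAMMAR.items():
        p = frame_scores.get(sequence[0], 0.0)
        if p > 0.0:
            start[(action, 0)] = math.log(p)
    return start


def step(hypotheses, frame_scores):
    """Advance every hypothesis by one frame.

    hypotheses maps (action, position in its sequence) to the best
    log-probability found so far; frame_scores maps a subaction label to an
    assumed per-frame probability from some upstream detector.
    """
    updated = {}
    for (action, pos), logp in hypotheses.items():
        sequence = GRAMMAR[action]
        # The grammar permits only two moves: stay, or advance to the next subaction.
        for next_pos in (pos, pos + 1):
            if next_pos >= len(sequence):
                continue
            p = frame_scores.get(sequence[next_pos], 0.0)
            if p <= 0.0:
                continue  # inconsistent with this frame, so the hypothesis is dropped
            candidate = logp + math.log(p)
            key = (action, next_pos)
            if candidate > updated.get(key, -math.inf):
                updated[key] = candidate
    return updated


def recognise(frames):
    """Return the surviving hypotheses, most probable first."""
    hypotheses = initial_hypotheses(frames[0])
    for frame_scores in frames[1:]:
        hypotheses = step(hypotheses, frame_scores)
    return sorted(hypotheses.items(), key=lambda item: item[1], reverse=True)


# Made-up detector scores for three consecutive frames.
frames = [
    {"boil_water": 0.8, "grip_bar": 0.2},
    {"boil_water": 0.5, "pour_water": 0.5},
    {"pour_water": 0.9, "steep": 0.1},
]
ranked = recognise(frames)
```

Because the hypotheses are updated one frame at a time, the ranking is available before the action has finished, which is the property that makes this style of processing suitable for streaming video. In the example, every weightlifting hypothesis is dropped at the second frame, since neither move the grammar allows is supported by what that frame shows.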

Pirsiavash believes that the system may have medical applications, including checking that physiotherapy exercises are being carried out correctly or assessing the extent to which motor function has returned in patients with neurological damage.
