In the last installment, we explained:
- Why Video Big Data will absolutely dwarf current Big Data
- How Video is the most difficult medium to extract data from
This explains why Video Big Data remains a largely unexplored field. It also means there are immense opportunities available, because we have not even scratched the tip of this huge data iceberg.
In this installment, we will examine the kinds of data elements that we can extract from videos.
In an hour of video, a person can say up to 9,000 words, so imagine the amount of data from speech alone. However, transcribing speech is fraught with problems, and we are only now starting to reach an acceptable level of accuracy.
Besides speech, text is probably the second most important element inside videos. For example, in a presentation or lecture, the speaker would augment the session with a set of slides. Another example is the news ticker that appears during a news broadcast.
There are thousands of objects inside a video, appearing in different timeframes. Therefore, it can be quite challenging to identify what objects are in the video content and in which scenes they appear.
The difference between video and still images is motion. Different video scenes contain complex activities, such as “running in a group” or “driving a car”. The ability to extract activities gives a lot of insight into what the videos are about. This includes flagging offensive content that might contain nudity and profanity.
Detecting motion enables you to efficiently identify sections of interest within an otherwise long and uneventful video. That might sound simple, but what if you have 10,000 hours of video to review every night? Eyeballing every minute of that footage is a near-impossible task.
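To make the idea concrete, here is a minimal sketch of one simple way to flag motion: comparing each frame with the one before it (frame differencing). It is an illustration only, not a description of any particular product. It assumes frames are already decoded into small grayscale grids (lists of rows of 0 to 255 values), and the names `motion_score`, `find_motion_segments`, `make_frame` and the threshold of 10.0 are our own hypothetical choices.

```python
def motion_score(prev_frame, frame):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for prev_row, row in zip(prev_frame, frame):
        for a, b in zip(prev_row, row):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def find_motion_segments(frames, fps, threshold=10.0):
    """Return (start_sec, end_sec) spans where consecutive frames differ by more than threshold."""
    segments = []
    start = None  # index of the first frame of the current moving span, if any
    for i in range(1, len(frames)):
        moving = motion_score(frames[i - 1], frames[i]) > threshold
        if moving and start is None:
            start = i
        elif not moving and start is not None:
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:
        # Motion continued until the last frame.
        segments.append((start / fps, len(frames) / fps))
    return segments


def make_frame(bright_col=None, size=4):
    """Hypothetical test frame: all black, optionally with one bright column."""
    return [[255 if c == bright_col else 0 for c in range(size)] for _ in range(size)]


# Synthetic clip at 1 frame per second: 4 static frames, a bright bar moving
# across 3 frames, then 3 static frames.
frames = (
    [make_frame() for _ in range(4)]
    + [make_frame(0), make_frame(1), make_frame(2)]
    + [make_frame() for _ in range(3)]
)
segments = find_motion_segments(frames, fps=1)
print(segments)  # [(4.0, 8.0)]
```

Only the flagged span needs a human reviewer. A real system would also have to cope with camera noise, lighting changes and compression artifacts, which this sketch ignores.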
Detecting faces in videos adds face detection capability to any surveillance or CCTV system. This is useful for analyzing human traffic within a mall, a street, or even a restaurant or café. When we include facial recognition, it opens up another data dimension.
Emotion detection is an extension of face detection that returns an analysis of multiple emotional attributes for each detected face. With emotion detection, one can gauge an audience's emotional response over a period of time.
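As a rough sketch of what "over a period of time" could look like in practice, the snippet below averages per-face emotion scores into fixed time windows. The input format (a timestamp in seconds plus a dictionary of emotion scores between 0 and 1), the emotion labels, and the function name `average_emotions` are all assumptions made for illustration. Real emotion detection services each return their own structure.

```python
from collections import defaultdict


def average_emotions(detections, window_sec=60):
    """Average emotion scores per time window.

    detections: iterable of (timestamp_sec, {emotion: score}) pairs, one per detected face.
    Returns {window_start_sec: {emotion: mean_score}}.
    An emotion missing from a detection is treated as a score of 0 for that face.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for ts, scores in detections:
        window = int(ts // window_sec) * window_sec
        counts[window] += 1
        for emotion, score in scores.items():
            sums[window][emotion] += score
    return {
        w: {e: total / counts[w] for e, total in sums[w].items()}
        for w in sorted(sums)
    }


# Hypothetical detections: two faces in the first minute, one in the second.
detections = [
    (5, {"happiness": 0.5, "surprise": 0.5}),
    (30, {"happiness": 1.0, "surprise": 0.0}),
    (65, {"happiness": 0.25, "surprise": 0.75}),
]
timeline = average_emotions(detections)
print(timeline)
```

Plotting the resulting per-window averages gives a simple curve of how the audience's mood shifts as the video plays.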
This list of video data is certainly not exhaustive, but it is definitely a good starting point for the field of Video Big Data. In the next installment, we will examine some of the techniques used to extract this video data.
The Babbobox Team