What is a Video Search Engine? Part V – Detecting Objects

Having covered how to search videos for Speech, Text, Motion, Faces and Emotions, we now turn to the next big class of elements: “Objects”.

Essentially, Object Detection is an analysis tool that extracts metadata from a video and identifies the important objects or entities inside it. The technology has reached the point where it can detect objects or entities (like cat, flower, computer, etc.) and pinpoint exactly where each one appears inside the video.

The purpose of Object Detection is to help us better understand the overall content of our videos based on the objects detected within them. It also gives us a time-based understanding of when each object appears within the video. Object Detection uses tagging and domain-specific models to identify content and label it with confidence.

Like the other vision-based video searches, Object Detection is built on deep learning: Computer Vision scientists use deep neural network models to detect and label thousands of objects and scenes in videos.

With Object Detection, it is now possible to search every moment of every video file in your video library and catalog to find every object as well as its importance.
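
To make the idea concrete, here is a minimal sketch of how time-stamped object labels could be indexed and searched. It is an illustration only, not how VideoSpace is implemented: the `Detection` fields, the function names and the sample data are all assumptions, and the detections themselves are assumed to come from an object detection model that has already run.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    video_id: str      # hypothetical identifier of the video file
    label: str         # object label, e.g. "cat"
    start_s: float     # time the object first appears, in seconds
    end_s: float       # time the object leaves the scene, in seconds
    confidence: float  # model confidence between 0.0 and 1.0


def build_object_index(detections):
    """Group detections by lowercase label so they can be looked up by name."""
    index = defaultdict(list)
    for det in detections:
        index[det.label.lower()].append(det)
    return index


def search_objects(index, label, min_confidence=0.5):
    """Return detections of `label` at or above the threshold, most confident first."""
    hits = [d for d in index.get(label.lower(), []) if d.confidence >= min_confidence]
    return sorted(hits, key=lambda d: d.confidence, reverse=True)


# Made-up sample data, for illustration only.
sample = [
    Detection("holiday.mp4", "cat", 12.0, 18.5, 0.93),
    Detection("holiday.mp4", "flower", 40.0, 44.0, 0.71),
    Detection("office.mp4", "computer", 3.0, 60.0, 0.88),
    Detection("office.mp4", "cat", 75.0, 77.0, 0.42),
]
index = build_object_index(sample)
for hit in search_objects(index, "Cat"):
    print(f"{hit.video_id}: {hit.label} from {hit.start_s}s to {hit.end_s}s")
```

Filtering on a confidence threshold is a design choice: a lower threshold finds more moments but lets in more doubtful labels.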

To find out more about how you can detect objects inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

Announcement – Adding “Object Detection” to VideoSpace Search Engine

We are delighted to announce that we are adding “Object Detection” capability to our Video Search Engine.

This is a significant milestone as it enhances our already extensive video search capabilities, establishing VideoSpace Search Engine as one of the most powerful Video Search Engines in the world.

With Object Detection, this is the list of elements that we can index and search inside videos:

  • Objects (NEW!)
  • Speech
  • Text
  • Motion
  • Face
  • Emotion
  • Offensive Content
  • Custom (e.g. Logos, Objects, Landmarks, etc.)

The new “Object Detection” feature enables us to detect entities (like cat, flower, computer, etc.) and pinpoint exactly where each one appears inside videos.

For example, in this short four-minute VIDEO, we are able to:

  • detect 111 unique entities
  • mark exactly where these 111 entities appear

To find out more about Object Detection in videos, please click HERE.

What is a Video Search Engine? Part IV – Detecting Faces and Emotions

The ability to detect faces has been around for some time in real-time CCTV systems. However, these systems are out of reach for many because they are expensive and need specialized implementation that drives the cost up even higher. Detecting faces from recorded videos is therefore a viable alternative, because it instantly adds face detection ability to any CCTV system.

Detecting unique faces allows you to count people and track their movements. Face detection finds and tracks human faces within a video. Multiple faces can be detected and subsequently tracked as they move around.

This is useful for analyzing human traffic within a mall, street or even a restaurant or café. Because unique human faces can be identified and tracked as they move, it is possible to perform a headcount of human traffic within the video.
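
As a rough sketch of the headcount idea, assume a face tracker has already assigned an ID to each unique face and reported when it was seen. The `(timestamp, face_id)` input format and the function name are hypothetical, chosen only for illustration.

```python
def summarize_faces(observations):
    """Map each tracked face ID to the first and last time it was seen.

    `observations` is an iterable of (timestamp_s, face_id) pairs, assumed
    to come from a face tracker that keeps IDs stable across frames.
    """
    seen = {}
    for timestamp_s, face_id in observations:
        first, last = seen.get(face_id, (timestamp_s, timestamp_s))
        seen[face_id] = (min(first, timestamp_s), max(last, timestamp_s))
    return seen


# Made-up tracker output, for illustration only.
observations = [(0.0, 1), (0.5, 1), (0.5, 2), (1.0, 2), (4.0, 3), (4.5, 1)]
summary = summarize_faces(observations)
print("Headcount:", len(summary))
for face_id, (first, last) in sorted(summary.items()):
    print(f"Face {face_id}: seen from {first}s to {last}s")
```

The headcount is simply the number of distinct IDs, so its accuracy depends entirely on how well the tracker keeps one ID per person.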

Beyond detecting faces, it is also possible to detect emotions. Emotion Detection is an extension of the Face Detection video search that returns analysis of multiple emotional attributes from the faces detected, for example happiness, sadness, fear, anger, etc.

Recognizing the emotion of a person or crowd over time allows us to track the emotional highs and lows within a particular time-frame. It also allows us to track someone’s emotions at a specific point in time, answering questions like: how did the crowd react when the President made a particular point? Emotion detection can be applied to gauge audience responses in scenarios like:

  • Speeches
  • Focus groups
  • Group reactions
  • Interviews

Emotion detection can form a very good baseline for the scenarios above.
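
To illustrate tracking emotional highs and lows over time, here is a small sketch that averages per-face emotion scores into fixed time windows and reports the dominant emotion in each. The input format (a timestamp plus a dictionary of scores between 0 and 1) and the 10-second window are assumptions made for this example.

```python
from collections import defaultdict


def emotion_timeline(readings, window_s=10.0):
    """Average emotion scores per time window and pick the dominant emotion.

    `readings` is an iterable of (timestamp_s, {emotion: score}) pairs,
    one per detected face per sampled frame (hypothetical format).
    Returns a list of (window_start_s, dominant_emotion, averages).
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for timestamp_s, scores in readings:
        window = int(timestamp_s // window_s)
        counts[window] += 1
        for emotion, score in scores.items():
            totals[window][emotion] += score

    timeline = []
    for window in sorted(totals):
        averages = {e: s / counts[window] for e, s in totals[window].items()}
        dominant = max(averages, key=averages.get)
        timeline.append((window * window_s, dominant, averages))
    return timeline


# Made-up readings, for illustration only.
readings = [
    (1.0, {"happiness": 0.8, "sadness": 0.1}),
    (4.0, {"happiness": 0.6, "sadness": 0.2}),
    (12.0, {"happiness": 0.2, "anger": 0.7}),
    (15.0, {"happiness": 0.1, "anger": 0.9}),
]
timeline = emotion_timeline(readings)
for start_s, dominant, _ in timeline:
    print(f"From {start_s}s: mostly {dominant}")
```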

To find out more about how you can detect faces and emotions inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

What is a Video Search Engine? Part III – Detecting Motion

In Parts I and II, we examined how we would be able to search Speech and Text inside videos. In Part III, we will look at one of the first names given to videos – “Motion” Picture.

So, do all videos have motion? Not necessarily. Not all videos have motion (or movement) all the time, especially in the case of security and surveillance videos.

Detecting motion in videos enables you to efficiently identify sections of interest within an otherwise long and uneventful video. That might sound simple with a single video, but what if you have 10,000 hours of video to review every night? Eyeballing every minute of it is a near-impossible task.

Motion detection can be used on static camera footage to identify sections of the video where motion occurs.

  • Detect when motion has occurred in videos with stationary backgrounds
  • Eliminate false positives caused by light changes, shadows, small insects, and others
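
The two points above can be sketched with simple frame differencing. This is a deliberately basic technique and not necessarily what VideoSpace or any commercial system uses; real systems typically rely on more robust background models. Frames are represented here as small grids of grayscale values, and both thresholds are assumed values chosen for the example.

```python
def changed_fraction(prev_frame, frame, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, row in zip(prev_frame, frame):
        for prev_px, px in zip(prev_row, row):
            total += 1
            if abs(px - prev_px) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def detect_motion(frames, pixel_threshold=25, min_fraction=0.1):
    """Return indices of frames that differ enough from the frame before them.

    pixel_threshold ignores gradual brightness changes; min_fraction ignores
    changes that cover only a tiny part of the picture (e.g. a small insect).
    """
    motion = []
    for i in range(1, len(frames)):
        if changed_fraction(frames[i - 1], frames[i], pixel_threshold) >= min_fraction:
            motion.append(i)
    return motion


def blank(value, size=4):
    """Build a size x size frame filled with one grayscale value."""
    return [[value] * size for _ in range(size)]


# Made-up 4x4 frames, for illustration only.
f0 = blank(100)                    # empty scene
f1 = blank(105)                    # slightly brighter: a light change
f2 = blank(105)
f2[0][0] = 255                     # one bright pixel: a small insect
f3 = blank(105)
for r in (2, 3):
    for c in (2, 3):
        f3[r][c] = 255             # a larger object enters the scene
f4 = [row[:] for row in f3]        # nothing changes afterwards

frames = [f0, f1, f2, f3, f4]
print(detect_motion(frames))
```

Only the frame where the larger object enters is reported; the light change and the single-pixel "insect" stay below the thresholds.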

While there are motion sensors that can detect motion in real time, these systems tend to be expensive, which is why most CCTV surveillance systems only record at best. Fortunately, many scenarios do not require real-time motion detection, like detecting a car entering a bus lane during peak hours.

Current technology has come to a point where it is able to differentiate between real motion (such as a person walking into a room), and false positives (such as leaves in the wind, along with shadow or light changes). This allows you to generate security alerts from camera feeds without being spammed with endless irrelevant events, while being able to extract moments of interest from extremely long surveillance videos.
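
Once the moments with real motion are known, extracting moments of interest from a long video comes down to grouping them into segments. The sketch below assumes a list of timestamps (in seconds) at which motion was detected; the function name and the 2-second gap are hypothetical choices.

```python
def motion_segments(motion_times_s, max_gap_s=2.0):
    """Merge motion timestamps into (start_s, end_s) segments.

    Timestamps no more than max_gap_s apart are treated as one event.
    """
    segments = []
    for t in sorted(motion_times_s):
        if segments and t - segments[-1][1] <= max_gap_s:
            segments[-1][1] = t          # extend the current segment
        else:
            segments.append([t, t])      # start a new segment
    return [tuple(segment) for segment in segments]


# Made-up timestamps, for illustration only.
times = [10.0, 10.5, 11.0, 95.0, 96.5, 300.0]
for start_s, end_s in motion_segments(times):
    print(f"Motion from {start_s}s to {end_s}s")
```

A reviewer then only needs to watch these short segments instead of the whole recording.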

To find out more about how you can detect motion inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

What is a Video Search Engine? Part II – Searching Text

In Part I, we found out that there are 7,099 living languages in the world. That includes both written and spoken-only languages. According to Ethnologue (20th edition), out of those 7,099 living languages, 3,866 have a developed writing system.

This leads us to the second part of our series – Searching Text inside a video. After Speech, Text is probably the second most important element from which we can extract data.

For example, in a presentation or talk, the speaker would typically augment the session with a set of slides. Therefore, besides the speaker’s voice, the text in the slides is another set of data that can be captured. This is important because what the speaker says and what the slides present can be vastly different.

Text that can be OCRed during a presentation

The technology to capture this text inside the video is called Video OCR (Optical Character Recognition). Video OCR is derived from OCR, a technology that has been around for a long time.

By strict definition, Optical Character Recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image (source: Wikipedia). The first OCR machine that read characters and converted them into standard telegraph code was invented by Emanuel Goldberg in 1914!

Unfortunately, one hundred years on, OCR technology still has some way to go, especially in adding more language capabilities and recognizing handwriting. However, with more A.I. and Machine Learning, the hope is that researchers can add more capabilities to what OCR can do now.

However, Video OCR is giving OCR a new lease of life by simply adding another dimension – moving images. Given the number of videos that have never been OCRed and the number of videos being generated every day, the potential for Video OCR is immense.
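
One practical consequence of the extra time dimension is that the same slide text shows up in many consecutive frames. The sketch below collapses per-frame OCR output into time spans so each piece of text is recorded once, with the time it was on screen. It assumes an OCR engine has already produced the text for each sampled frame; the input format and function name are hypothetical.

```python
def collapse_ocr_frames(frame_texts):
    """Merge consecutive frames that show the same text into one span.

    `frame_texts` is a list of (timestamp_s, text) pairs in time order.
    Whitespace is normalized before comparing, and frames without text
    end the current span.
    """
    spans = []
    previous = ""
    for timestamp_s, text in frame_texts:
        normalized = " ".join(text.split())
        if normalized and normalized == previous:
            spans[-1]["end_s"] = timestamp_s
        elif normalized:
            spans.append({"text": normalized, "start_s": timestamp_s, "end_s": timestamp_s})
        previous = normalized
    return spans


# Made-up OCR output, for illustration only.
frame_texts = [
    (0.0, "Agenda"),
    (1.0, "Agenda "),
    (2.0, ""),
    (3.0, "Quarterly  results"),
    (4.0, "Quarterly results"),
]
spans = collapse_ocr_frames(frame_texts)
for span in spans:
    print(f'"{span["text"]}" on screen from {span["start_s"]}s to {span["end_s"]}s')
```

Real OCR output is noisier than this, so an exact string match is a simplification; a fuzzy comparison would usually be needed in practice.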

To find out more about how you can search TEXT inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

What is a Video Search Engine? Part I - Searching Speech

Of all formats, videos are the most difficult to search. Typically, current search engines can only search for "Title" and "Metadata" of the videos, which are manually keyed in by a human. There is no way to search the content inside the video. For example, how do you find a specific piece of news in a news clip? Or specific words that appear inside a video? How can you find them without actually watching the videos yourself?

Before we even get into the question of what a video search engine is, we need to understand what we can search inside a video. Elements can include Speech, Words (or Text), Motion, Emotions, Faces and Objects.

To kick off this “What is a Video Search Engine?” series, let’s tackle the most obvious of the elements – Speech.

In an hour, a person can say up to 9,000 words. Given the rate at which videos are being produced today, that’s a lot of words. According to the Ethnologue catalogue of world languages, there are currently 7,099 living languages. Obviously, Speech Recognition technology has not been able to keep up with this vast number of languages. However, the good news is (depending on how you see things) that just 23 languages account for more than half of the world’s population.

Languages in the World (Source: www.ethnologue.com)

On the technical aspect of searching speech in videos, the following process is required: 

  1. Transcribe (Speech-to-Text) - transcribe the speech in the video
  2. Index - make the speech searchable
  3. Search - bring the user to exactly where the search terms are in the video
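
The Index and Search steps above can be sketched with a simple inverted index. The Transcribe step is assumed to have been done already by a speech-to-text service that returns each word with its start time; the input format, function names and sample transcript are hypothetical.

```python
import re
from collections import defaultdict


def index_transcript(video_id, timed_words, index=None):
    """Add (word, start_s) pairs from one video to an inverted index.

    The index maps each lowercase word to a list of (video_id, start_s).
    """
    index = index if index is not None else defaultdict(list)
    for word, start_s in timed_words:
        token = re.sub(r"[^\w']+", "", word.lower())  # strip punctuation
        if token:
            index[token].append((video_id, start_s))
    return index


def search_speech(index, term):
    """Return every (video_id, start_s) at which `term` was spoken."""
    return sorted(index.get(term.lower(), []))


# Made-up transcript, for illustration only.
transcript = [
    ("Good", 0.0), ("evening,", 0.4), ("the", 1.0), ("budget", 1.2),
    ("was", 1.7), ("announced", 1.9), ("today.", 2.5),
    ("The", 4.0), ("budget", 4.2), ("includes", 4.8),
]
speech_index = index_transcript("news.mp4", transcript)
for video_id, start_s in search_speech(speech_index, "Budget"):
    print(f"{video_id}: jump to {start_s}s")
```

Because every hit carries a timestamp, a player can jump straight to the moment the word is spoken rather than just returning the video as a whole.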

The steps involved might sound simple, but transcribing speech is filled with problems. Several factors can affect the accuracy of speech recognition. For example:

  • heavy localized accent
  • low speech volume
  • bad diction
  • heavy background noise
  • multiple voices speaking at the same time

With the above in consideration, there are a lot of videos that are “not suitable” for machine transcribing: movies, TV shows, anything with mixed audio and sound effects, poorly recorded content with background noise (hiss).

To find out more about how you can search speech inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

Video Platform from a DevOps perspective - Babbobox CTO, Sabrina Lim at CloudExpo Asia 2017

Babbobox CTO, Sabrina Lim (yes.. she's a female CTO), will be speaking at CloudExpo Asia - DevOps Live at 12.05pm on 12 Oct.


Essentially, she'll be speaking about a combination of technologies covering Media, Search, A.I., Cognitive and unstructured data (all the stuff that we are using) from the DevOps perspective. So yes... it'll be a bit geeky and techie!

So if you are at the show, do drop by and say hi!

UK, US... Here we come! Catch us at Microsoft Tech Summit Birmingham and Washington, DC!

Time to get out there and show the world our Babbobox Search Engine

We are excited to be invited to participate in the exclusive Microsoft Tech Summit. As part of our expansion plan, we have decided to take part in the events in Birmingham (UK) and Washington, DC (US).

Here are the dates and venue details:


Birmingham, UK

Date: January 24-25, 2018
Venue: National Exhibition Centre (NEC) Birmingham B40 1NT United Kingdom
URL: https://www.microsoft.com/en-gb/techsummit/birmingham


Washington, DC, USA

Date: March 5-6, 2018
Venue: Ronald Reagan Building and International Trade Center
1300 Pennsylvania Avenue NW
Washington, DC 20004
URL: https://www.microsoft.com/en-us/techsummit/washington-dc


If you are in town then, do drop by and say hi! Watch this space for further Tech Summit updates!

#babbobox #videospace #videosearchengine #microsoft

Microsoft recognizes Babbobox as key global partner for Media Services

We are delighted to be listed as a key Microsoft global partner for Azure Media Services on http://amslabs.azurewebsites.net/. (Please scroll down)

This is in recognition of Babbobox's pioneering Unified Search Engine and Video-Search-as-a-Service, both of which are world firsts.

"We are honoured to be invited by Microsoft to be part of this exclusive club of partners." said Alex Chan, Babbobox's CEO, "Considering Babbobox is still a relatively new entrant in comparison to other partners, to be invited into this elite group shows that Microsoft recognizes the huge potential in the things that we are doing. That Babbobox is pushing the boundaries and transforming the way we search in future."

Find out more about Babbobox Search Engine.
Watch this space for more updates!

Let's talk Unstructured Data! Babbobox CEO, Alex Chan, will be speaking at SWITCH 2017!

Every one of us is bombarded with information through various media on a daily basis. Just how do we deal with all of it?

Babbobox CEO, Alex Chan, will be joined by Infini Video's Ken Lim to examine how data can be better managed such that users can navigate complex topics more easily.

Catch Ken and Alex at Campus Day as they share more on how we can use unstructured data efficiently!

Register here: http://eventregist.com/e/SWITCH-Pass-2017

#CPSG1 #CPDAY #SwitchSG17 #Babbobox