CSP Cognitive Engine

The CSP Cognitive AI Engine is available in the cloud and on-premises for large installations.

The AI Engine can process millions of videos. It:
1. Allows full customization
2. Supports multiple languages
3. Performs cognitive analysis, model detection, facial recognition, speech recognition, OCR, sentiment analysis, and multiple-speaker detection

Processing time: Depends on video length and quality. A 1-minute video usually takes up to 5 minutes to process.
Video sizes: A video file can be at most 30 GB in size and at most 4 hours long.
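
The limits above can be checked before a video is submitted. The following is a minimal sketch, not part of the AI Engine itself: the names `VideoInfo`, `check_limits`, and `worst_case_processing_seconds` are hypothetical, 30 GB is assumed to mean 30 x 10^9 bytes, and the factor of 5 is simply the "up to 5 minutes per 1 minute of video" figure treated as a rough upper bound.

```python
from dataclasses import dataclass

# Limits taken from the text above. Decimal gigabytes are an assumption.
MAX_SIZE_BYTES = 30 * 1000**3      # 30 GB
MAX_DURATION_SECONDS = 4 * 60 * 60  # 4 hours
# "1 minute of video takes up to 5 minutes to process", used as a rough bound.
WORST_CASE_FACTOR = 5


@dataclass
class VideoInfo:
    """Hypothetical description of a video file awaiting upload."""
    size_bytes: int
    duration_seconds: float


def check_limits(video: VideoInfo) -> list[str]:
    """Return a list of limit violations (empty if the video is acceptable)."""
    problems = []
    if video.size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 30 GB")
    if video.duration_seconds > MAX_DURATION_SECONDS:
        problems.append("video is longer than 4 hours")
    return problems


def worst_case_processing_seconds(video: VideoInfo) -> float:
    """Rough upper estimate of processing time, in seconds."""
    return video.duration_seconds * WORST_CASE_FACTOR


if __name__ == "__main__":
    clip = VideoInfo(size_bytes=2 * 1000**3, duration_seconds=10 * 60)
    print(check_limits(clip))
    print(worst_case_processing_seconds(clip) / 60, "minutes (estimate)")
```

Actual processing time depends on video quality as well as length, so the estimate is only useful for coarse planning.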

Main Turnkey Features

1. Face detection: Detects and groups faces appearing in the video.
2. Face labelling: Detects and labels faces based on defined person models.
3. Image extraction for faces: Provides a list of detected faces and their positions in the video stream.
4. Observed people tracking: Detects observed people in videos and provides information such as the person's location in the video frame, the exact timestamps (start, end), and the confidence when a person appears.
5. Public figures identification: Automatically identifies people. The engine is pre-trained on the faces of public figures, so it can recognize them.
6. Visual text recognition (OCR): Extracts text that’s visually displayed in the video.
7. Visual content moderation: Detects adult and/or racy visuals.
8. Black frame detection: Identifies black frames present in the video.
9. Animated characters detection: Detects, groups, and recognizes characters in animated content.
10. Speech transcription: Converts speech to text in over 50 languages. Supported languages include English (US), English (United Kingdom), English (Australia), Spanish, Spanish (Mexico), French, French (Canada), German, Italian, Mandarin Chinese, Chinese (Cantonese, Traditional), Chinese (Simplified), Japanese, Russian, Portuguese, Hindi, Czech, Dutch, Polish, Danish, Norwegian, Finnish, Swedish, Thai, Turkish, Korean, Arabic (Egypt), Arabic (Syrian Arab Republic), Arabic (Israel), Arabic (Iraq), Arabic (Jordan), Arabic (Kuwait), Arabic (Lebanon), Arabic (Oman), Arabic (Qatar), Arabic (Saudi Arabia), Arabic (United Arab Emirates), Arabic (Palestinian Authority), and Arabic Modern Standard (Bahrain).
11. Automatic language detection: Automatically identifies the dominant spoken language. Supported languages include English, Spanish, French, German, Italian, Mandarin Chinese, Japanese, Russian, and Portuguese. If the language can’t be identified with confidence, AI Engine for Media assumes the spoken language is English.
12. Speaker enumeration: Maps and understands which speaker spoke which words and when. Up to 16 speakers can be detected in a single audio file.
13. Speaker statistics: Provides statistics for speakers’ speech ratios.
14. Emotion detection: Identifies emotions based on speech (what’s being said) and voice tonality (how it’s being said). The emotion can be joy, sadness, anger, or fear.
15. Audio effects detection (preview): Detects various acoustic events and classifies them into acoustic categories (such as Gunshot, Screaming, Crowd Reaction, and more). The detected acoustic events are included in the closed captions file, which can be downloaded from the AI Engine for Media portal. For more information, see Audio effects detection.
16. Named entities extraction: Extracts brands, locations, and people from speech and visual text via natural language processing (NLP).
17. Sentiment analysis: Identifies positive, negative, and neutral sentiments from speech and visual text.
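
To illustrate what the speaker statistics feature reports, the sketch below computes each speaker's share of total speaking time from a list of speaker segments. It is only an assumption about how such ratios could be derived: the `(speaker, start, end)` tuple format and the function name `speech_ratios` are hypothetical and do not reflect the engine's actual output format.

```python
from collections import defaultdict


def speech_ratios(segments):
    """Compute each speaker's share of total speaking time.

    `segments` is a hypothetical list of (speaker, start_seconds, end_seconds)
    tuples, such as speaker enumeration might produce.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start

    total_time = sum(totals.values())
    if total_time == 0:
        return {}  # no speech detected, so no ratios to report
    return {speaker: t / total_time for speaker, t in totals.items()}


if __name__ == "__main__":
    # Illustrative data only: two speakers over one minute of audio.
    segments = [
        ("Speaker 1", 0.0, 30.0),
        ("Speaker 2", 30.0, 40.0),
        ("Speaker 1", 40.0, 60.0),
    ]
    for speaker, ratio in speech_ratios(segments).items():
        print(f"{speaker}: {ratio:.0%}")
```

Ratios here are relative to total speaking time, not total video length, so silence does not affect the result.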