Video Moderation

Video moderation detects malicious content in video files and returns a suggested action for each moderation result. It can inspect the audio and video data in a file at the same time.

1. Introduction

The caller submits one or more video files for moderation and specifies the detection types. The server returns the detection result through an asynchronous callback.

Supported image detection types and their labels are listed below:

Detection Type | Description | Action | Primary Label | Secondary Label
Porn recognition | Recognize pornographic and sexy content in pictures | v-porn | normal: normal | normal: normal
Porn recognition | Recognize pornographic and sexy content in pictures | v-porn | sexy: sexy | female_underwear: female underwear; female_sexy_chest_l12: female sexy chest level 12; female_sexy_chest_l3: female sexy chest level 3; female_sexy_chest_l4: female sexy chest level 4; female_backless: female backless; female_sexy_leg: female sexy leg; female_focus_leg: female focus leg; bathing_suit: bathing suit; male_topless: male topless; male_normal_topless: male normal topless; other_sexy: other sexy
Porn recognition | Recognize pornographic and sexy content in pictures | v-porn | porn: porn | sex_product: sex aids; naked_private_part: exposed sensitive parts; extensive_naked: extensive naked; sex_behavior: sex behavior; naked_female_back: naked female back; naked_hip: naked hip; sex_bulge: sex bulge; focus_female_crotch: focus female crotch; focus_male_crotch: focus male crotch; hand_on_sexy: hand on sexy; lick: lick; kiss: kiss; sm: SM; sperm: sperm; naked_child: naked child; other_dirty: other dirty; tongue_out: tongue out; female_focus_hip: female focus hip; male_underwear: male underwear; porn_pip: porn pip
Terrorism recognition | Recognize bloody and terrorism content in pictures | v-terrorism | normal: normal; fire_explosion: fire explosion; gun: gun; knife: knife; crowd: crowd; flag_of_terrorism: flag of terrorism; special_dress: special dress; disgusted: disgusted; with_weapon: with weapon; bloody: bloody; uniform: uniform | none
Sensitive information recognition | Recognize sensitive content in pictures | v-antispam | normal: normal; special_building: special building; rmb: RMB; map_of_China: map of China; cartoons_of_leaders: cartoons of leaders; flags_of_China: flags of China; Tibetan_buddhism: Tibetan Buddhism; other_antispam: other sensitive information; tank: tank; fighter: fighter; cannon: cannon; battleship: battleship | none
Sensitive figure recognition | Recognize domestic and overseas politicians and public figures in pictures | v-sface | normal: normal; sface: sensitive figure involved | none
Illegal recognition | Identify whether the picture contains illegal scene information | v-illegal | normal: normal; minor: minor; drug: drug; drive: drive; gamble: gamble; smoke: smoke; id_infomation: ID information; tattoo: tattoo; inbed: lie on the bed | none
AD recognition | Identify whether the image contains advertising information | v-ad | normal: normal; QR_code: QR code; bar_code: bar code; applet_code: applet code | none
OCR recognition | Identify whether the picture contains text suspected of violations | v-ocr | normal: normal; ocr_politics: politics; ocr_terrorism: terrorism; ocr_porn: porn; ocr_illegal: illegal; ocr_abuse: abuse; ocr_ad: ad | none

Supported audio detection types and their labels are listed below:

Detection Type | Description | Action | Label
Moan recognition in audio | Detect voiceprint features in audio files and recognize illegal features, such as moan | a-porn | normal: normal; moan: moan
Sensitive word recognition in audio | Transcribe audio files to text and recognize sensitive and illegal content | a-antispam | normal: normal; terrorism: terrorism; porn: porn; illegal: illegal; politics: politically sensitive content; abuse: abuse; ad: advertisement cheating; feudalism: feudalism; religion: religiously sensitive content; affairs: affairs; contraband: contraband; minors: minors; banned-website: banned websites
Automatic speech recognition | Transcribe audio files to text | a-asr | normal: normal

2. Restrictions

Restriction Category | Description
Video format | Supports .avi, .mp4, .asf, .wmv, and .mov. For other formats, contact customer support.
Limit on file size | A single video must not exceed 200 MB. For larger videos, contact customer support.
Screenshot interval | One screenshot every 2 seconds
Concurrency restriction | You can submit up to 20 videos for moderation per second, and the system processes up to 200 videos concurrently. For a higher concurrency capacity, contact customer support.
Save duration | The system automatically saves suspected illegal screenshots and audio clips, and returns the file URLs and detection results to the user. These files are kept for 3 hours, and their URLs may become invalid after 3 hours. Export the files in time.
Video resolution | At least 128 x 128. Lower resolutions may reduce recognition accuracy.
Area restriction | Chinese mainland only. For support in other countries and regions, contact customer support.
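
The limits above can be checked on the client before a video is submitted. The following is a minimal sketch only: check_video is a hypothetical helper that is not part of the API, and it assumes that 1 MB means 1024 * 1024 bytes, which the table does not state.

```python
import os

# Limits copied from the Restrictions table.
SUPPORTED_EXTENSIONS = {".avi", ".mp4", ".asf", ".wmv", ".mov"}
MAX_FILE_SIZE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MIN_RESOLUTION = 128  # minimum width and height in pixels


def check_video(filename, size_bytes, width, height):
    """Return a list of problems; an empty list means the file passes all checks."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported format: %s" % (ext or "(none)"))
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file larger than 200 MB")
    if width < MIN_RESOLUTION or height < MIN_RESOLUTION:
        problems.append("resolution below 128 x 128")
    return problems


if __name__ == "__main__":
    print(check_video("clip.mp4", 50 * 1024 * 1024, 1280, 720))
    print(check_video("clip.mkv", 300 * 1024 * 1024, 64, 64))
```

A file that fails the format or size check may still be accepted after contacting customer support, so a caller might treat these problems as warnings and not as hard failures.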

3. API

3.1 Start the Task

Request APIs

Item | Description
Request method | POST
Request protocol | HTTPS
Request domain name | ai.jocloud.com
Request path | /app/{appid}/v1/video/async?traceId=uuid-xxxx-xxxx-xxxx-xxxx
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request.
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it)
Request body | JSON string, defined as follows

Request Parameters

The request parameters are passed as a JSON object in the request body. The fields are described below:

Name | Type | Required | Description
actions | String array | Yes | Detection types. Video detection options: v-porn (porn recognition); v-terrorism (terrorism recognition); v-antispam (sensitive information recognition); v-sface (sensitive figure recognition); v-illegal (illegal recognition); v-ad (ad information recognition); v-ocr (OCR information recognition). Audio detection options: a-porn (moan recognition); a-antispam (sensitive word recognition); a-asr (automatic speech recognition).
data[] | JSON array | Yes | List of detection objects. Each element of the JSON array is a detection task structure (see Table "Request Data" below). A single call can process up to 5 videos.
callback | String | No | Result callback address; HTTP and HTTPS callbacks are supported. May be null. When the callback address is null, obtain the detection result through the search API (receiving the moderation result through callback is recommended).
sequence | String | No | Value used for the signature in the callback notification request. Required when callback is set. See Callback of Detection Results for usage.

Table: Request Data

Name | Type | Required | Description
dataId | String | Yes | Unique data ID, for example: uuid-xxxx-xxxx-xxxx-xxxx
dataType | String | Yes | Data type. Must be URL.
content | String | Yes | HTTP address of the video file to be detected
extra | JSON | No | Extended parameters for audio detection; see the following table
context | JSON | No | Custom context data, returned unchanged with the result

Table: Extended parameters

Name | Type | Required | Description
lang | String | No | Language used for a-antispam detection. Defaults to Chinese. Options: chinese (Chinese); bahasa (Bahasa)
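
The request body described by the tables above can be assembled and validated before it is sent. The following is a minimal sketch: build_request_body is a hypothetical helper, the limit of 5 videos is read as a per-request limit, and the rule that sequence must accompany callback follows the sequence description. Sending the body is left to the caller.

```python
import uuid

# Action names copied from the actions field description.
VIDEO_ACTIONS = {"v-porn", "v-terrorism", "v-antispam", "v-sface",
                 "v-illegal", "v-ad", "v-ocr"}
AUDIO_ACTIONS = {"a-porn", "a-antispam", "a-asr"}
MAX_VIDEOS_PER_REQUEST = 5  # assumption: the documented limit applies per request


def build_request_body(actions, video_urls, callback=None, sequence=None, lang=None):
    """Build the JSON-serializable body for a Start the Task request."""
    unknown = [a for a in actions if a not in VIDEO_ACTIONS | AUDIO_ACTIONS]
    if not actions or unknown:
        raise ValueError("actions must be non-empty and known, got: %r" % (unknown,))
    if not 1 <= len(video_urls) <= MAX_VIDEOS_PER_REQUEST:
        raise ValueError("expected 1 to %d videos" % MAX_VIDEOS_PER_REQUEST)
    if callback and not sequence:
        raise ValueError("sequence is required when callback is set")

    # lang only affects a-antispam; an empty extra object is sent otherwise.
    extra = {"lang": lang} if lang else {}
    data = [
        {
            "dataId": str(uuid.uuid4()),  # a fresh unique ID per video
            "dataType": "URL",
            "content": url,
            "extra": dict(extra),
        }
        for url in video_urls
    ]
    body = {"actions": list(actions), "data": data}
    if callback:
        body["callback"] = callback
        body["sequence"] = sequence
    return body


if __name__ == "__main__":
    print(build_request_body(["v-porn", "a-antispam"],
                             ["http://example.com/video.mp4"],
                             callback="http://example.com/callback",
                             sequence="my-sequence",
                             lang="chinese"))
```

The returned dictionary can be passed to an HTTP client as the JSON body of the POST request, together with the Content-Type and token headers listed in the request table.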

Return the Result

Name | Type | Required | Description
code | Integer | Yes | Error code; consistent with HTTP status codes, with extensions. See Error code in request.
message | String | Yes | Error message description
traceId | String | Yes | The traceId passed in the request parameter
requestId | String | Yes | Unique request ID generated by the system for this request, used for the subsequent result callback and status query.
timestamp | Integer | Yes | Current Unix timestamp (s)

3.2 Callback of Detection Results

Callback Method

Upon completion of detection, the system sends an HTTP POST request to the callback address provided by the user and returns the detection result in that request.
To prevent content tampering, the HTTP request header carries a checksum item that can be used to verify the content.

The checksum string is generated as follows:

The sequence value from the Start the Task parameters is concatenated with the body string, and the SHA256 hash of the concatenated string is the checksum value.
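
The checksum rule above can be verified on the receiving side roughly as follows. This is a sketch under assumptions: the text does not say how the hash is encoded, so a hexadecimal string is assumed, and UTF-8 is assumed for the text encoding. compute_checksum and verify_callback are hypothetical helper names.

```python
import hashlib
import hmac


def compute_checksum(sequence, body):
    """SHA256 of sequence + body, as a lowercase hex string (encoding is an assumption)."""
    return hashlib.sha256((sequence + body).encode("utf-8")).hexdigest()


def verify_callback(sequence, body, received_checksum):
    """Return True when the checksum received with the callback matches the body.

    body must be the raw request body exactly as received, not a re-serialized
    copy of the parsed JSON, because any change in whitespace changes the hash.
    """
    expected = compute_checksum(sequence, body)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received_checksum.lower().encode("utf-8"))


if __name__ == "__main__":
    raw_body = '{"code":200,"message":"OK"}'
    checksum = compute_checksum("my-sequence", raw_body)
    print(verify_callback("my-sequence", raw_body, checksum))
    print(verify_callback("my-sequence", raw_body + " ", checksum))
```

If verification fails against real callbacks, the encoding assumption is the first thing to check, for example by comparing against a Base64 form of the same digest.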

Callback Contents

The detection result is returned as a JSON structure in the body. The fields are described as follows:

Name | Type | Required | Description
code | Integer | Yes | Error code; consistent with HTTP status codes, with extensions. See Error code in request.
message | String | Yes | Error message description
traceId | String | Yes | The traceId passed through from the request parameter
requestId | String | Yes | Unique task ID generated by the system for this detection request
timestamp | Integer | Yes | Current Unix timestamp (s)
data[] | JSON array | No | List of detection results (see the table "Returned Data" below for the structure). Each item in the array is the processing result for one data item. This field may be empty in case of errors.

Table: returned data

Name | Type | Required | Description
code | Integer | Yes | Error code. See Error code in data and action.
message | String | Yes | Error description
dataId | String | Yes | The dataId from the request
taskId | String | Yes | Unique task identifier generated for the multiple detection types of this detection object
context | JSON | No | The context from the request
results[] | JSON array | No | Result data; present when the call succeeds. The elements correspond to the actions in the request. Each element is a structure that holds the processing result of the corresponding action. The result structures for different actions are listed in the following table.

Table: result

Name | Type | Required | Description
code | Integer | Yes | Error code. See Error code in data and action.
message | String | Yes | Error description
action | String | Yes | Detection type, corresponding to the actions parameter of the request
label | String | Yes | Detection result label. See the detection types and their labels above.
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the higher the confidence.
suggestion | String | Yes | Recommended operation. Options: pass (normal, no operation needed); review (suspected, detection result uncertain, further manual moderation required); block (illegal, punishment suggested)
duration | Floating-point number | No | Length of the detected audio; returned for audio detection types
text | String | No | Text content of audio clips; provided only when action is "a-antispam" or "a-asr".
segment[] | JSON array | No | List of video frame or audio segment results with suggestion 'review' or 'block'. Different actions use different segment parameters; see the segment definition for each action below.

The structure of result->segment depends on the action. The following cases apply:

(1) When action is v-porn, v-terrorism, v-antispam, v-sface, v-illegal, v-ad, or v-ocr, the structure of result->segment is as follows:

Name | Type | Required | Description
label | String | Yes | Detection label; the one with the largest rate among all face labels
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the higher the confidence.
suggestion | String | Yes | Suggested operation
url | String | Yes | Screenshot address
timeOffset | Floating-point number | Yes | Time offset of the screenshot from the start of the video
extraData[] | JSON array | No | Present only when action is 'v-porn' or 'v-sface'. For 'v-porn', the array holds the identified secondary label information; the element structure is shown in the porn table below. For 'v-sface', the array holds the face information of all people recognized in the screenshot; the element structure is shown in the face table below.

Table: face

Name | Type | Required | Description
label | String | Yes | Detected face label
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the higher the confidence.
name | String | Yes | Name of the detected sensitive figure
x | Integer | Yes | X-coordinate of the upper left corner of the detected face in the picture
y | Integer | Yes | Y-coordinate of the upper left corner of the detected face in the picture
w | Integer | Yes | Width of the detected face
h | Integer | Yes | Height of the detected face

Table: porn

Name | Type | Required | Description
label | String | Yes | Secondary label
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the higher the confidence.

(2) When action is a-porn, the structure of result->segment is as follows:

Name | Type | Required | Description
begin | Floating-point number | No | Start time of the audio clip (s)
end | Floating-point number | No | End time of the audio clip (s)
score | Floating-point number | No | Degree of match with moan, in the range [0, 100]. The higher the score, the closer the match.

(3) When action is a-antispam, the structure of result->segment is as follows:

Name | Type | Required | Description
begin | Floating-point number | No | Start time of the audio clip (s)
end | Floating-point number | No | End time of the audio clip (s)
extraData[] | JSON array | No | List of sensitive word recognition results for this audio segment; the element structure is shown in the antispam-extraData table below

Table: antispam-extraData

Name | Type | Required | Description
hint | JSON array | No | Matched keywords
label | String | No | Type of the matched keyword
rate | Floating-point number | No | Not meaningful; always "1.0"
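
The nested callback structure (data[] -> results[] -> segment[]) can be flattened into a list of items that need attention. The following is a minimal sketch: collect_flagged is a hypothetical helper, and the sample payload in the usage part is invented for illustration and is not real service output.

```python
def collect_flagged(callback_body):
    """Return one dict per segment whose result suggestion is 'review' or 'block'."""
    flagged = []
    for item in callback_body.get("data") or []:
        for result in item.get("results") or []:
            if result.get("suggestion") not in ("review", "block"):
                continue
            for seg in result.get("segment") or []:
                flagged.append({
                    "dataId": item.get("dataId"),
                    "action": result.get("action"),
                    # Audio segments have no label or suggestion of their own,
                    # so fall back to the values of the enclosing result.
                    "label": seg.get("label", result.get("label")),
                    "suggestion": seg.get("suggestion", result.get("suggestion")),
                    # Video segments carry timeOffset; audio segments carry begin.
                    "at": seg.get("timeOffset", seg.get("begin")),
                    "url": seg.get("url"),
                })
    return flagged


if __name__ == "__main__":
    # Hypothetical payload shaped after the tables above.
    sample = {
        "code": 200,
        "data": [{
            "dataId": "uuid-xxxx",
            "results": [
                {"action": "v-porn", "label": "porn", "suggestion": "block",
                 "segment": [{"label": "porn", "rate": 0.98, "suggestion": "block",
                              "url": "http://example.com/shot.jpg", "timeOffset": 4.0}]},
                {"action": "a-porn", "label": "moan", "suggestion": "review",
                 "segment": [{"begin": 1.5, "end": 3.0, "score": 80.0}]},
                {"action": "v-ad", "label": "normal", "suggestion": "pass"},
            ],
        }],
    }
    for entry in collect_flagged(sample):
        print(entry)
```

Because saved screenshots and audio clips are kept for only 3 hours, any url collected this way should be downloaded promptly if the file is needed later.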

3.3 Synchronous Search of Results

Because processing is asynchronous, receiving the result through the asynchronous callback described above is recommended. If necessary, the result can also be obtained by polling the synchronous API described below:

Request Method

Item | Description
Request method | GET
Request protocol | HTTPS
Request domain name | ai.jocloud.com
Request path | /app/{appid}/v1/video/async/results?traceId=uuid-xxxx-xxxx-xxxx-xxxx&requestId=yyyy
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request. requestId is the ID of the request to be searched, that is, the requestId returned when the detection task was submitted.
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it)

Return Parameters

The body contains JSON data. The fields are described as follows:

Name | Type | Required | Description
code | Integer | Yes | Error code. See Error code in request.
message | String | Yes | Error message description
traceId | String | Yes | The traceId passed through from the request parameter
requestId | String | Yes | requestId of the current search, consistent with the request parameter
status | String | Yes | Task status: received (pending), processing (in progress), completed (done)
timestamp | Integer | Yes | Current Unix timestamp (s)
data[] | Array | No | Result data. When the call succeeds (code==200), see the table "returned data" above for the element definition.
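
Polling with the search API can be sketched as a loop that stops once status reports completion. This is an illustration under assumptions: wait_for_results and its parameters are hypothetical, the status strings are assumed to be the literal values received, processing, and completed, and code is assumed to be 200 while a task is still pending. The HTTP call itself is supplied by the caller as fetch_status, so the loop stays independent of any HTTP library.

```python
import time


def wait_for_results(fetch_status, interval_s=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until the task is completed, then return its data list.

    fetch_status is a caller-supplied function that performs the GET request
    and returns the parsed JSON body as a dict.
    """
    for attempt in range(max_attempts):
        response = fetch_status()
        if response.get("code") != 200:
            raise RuntimeError("search failed: %s %s"
                               % (response.get("code"), response.get("message")))
        if response.get("status") == "completed":
            return response.get("data") or []
        # Still received or processing: wait before asking again.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError("task not completed after %d attempts" % max_attempts)


if __name__ == "__main__":
    # Simulated responses stand in for real HTTP calls.
    fake = iter([
        {"code": 200, "status": "received"},
        {"code": 200, "status": "processing"},
        {"code": 200, "status": "completed", "data": [{"dataId": "uuid-xxxx"}]},
    ])
    print(wait_for_results(lambda: next(fake), interval_s=0.01))
```

With the requests library, fetch_status could be a small function that calls requests.get on the search URL with the token header and returns the result of .json(). The polling interval and attempt count are arbitrary defaults and should be tuned to typical video length.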

4. Sample Code

The following sample shows how to call the API with Python:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import requests
import uuid
import base64

host = "https://ai.jocloud.com"
appid = 123456789  # Your Service id
restful_id = '********************'  # Your certificate ID
restful_secret = '********************'  # Your certificate key
traceid = str(uuid.uuid4())
dataid = str(uuid.uuid4())

# url
url = host + '/app/%s/v1/video/async/submit?traceId=%s' % (appid, traceid)

# headers
headers = {
    "content-type": "application/json"
}

auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()

# The URL of the video file to be identified
file_url = 'http://newcntv.qcloudcdn.com/asp/hls/1200/0303000a/3/default/d67f1b655f0b49be87fdcc84f7f06029/7.ts'

# Context information that the service returns in the callback message to help with subsequent processing, for example:
context = {
    'myid': 123,
    'myname': 'test'
}

# Callback address for the identification result and status; result and status notifications are sent to it through HTTP POST
callback_addr = 'http://mydomain.com/callback'

# content
values = {
    'actions': ['v-sface', 'v-porn'],
    'data': [
        {
            'dataId': dataid,
            'dataType': 'URL',
            'content': file_url,
            'extra': {},
            'context': context
        }
    ],
    'callback': callback_addr,
    'sequence': 'test'
}

# request
res = requests.post(url, json=values, headers=headers)
print('url=%s\nbody=%s\ncode=%s\ndata=%s\n' % (url, values, res.status_code, res.text))

5. Update History

Version | Time | Description
v2.2.3 | 2020-10-15 | Add new label in OCR recognition
v2.2.2 | 2020-10-14 | Add 'a-asr' action support
v2.2.1 | 2020-10-13 | Add 'inbed' label in Illegal recognition
v2.2.0 | 2020-08-31 | Add ad and OCR recognition; update labels of 'v-terrorism' and 'v-antispam' recognition
v2.1.0 | 2020-08-24 | Add illegal recognition
v2.0.0 | 2020-08-17 | Add secondary labels in the porn recognition
v1.1.0 | 2020-06-30 | Remove 'interval' and 'maxframes' from the 'extra' params in the start request; the screenshot interval is fixed at every 2 seconds
v1.0.0 | 2020-03-13 | Initial version
