Video moderation detects malicious content in video files and returns a suggested action for each moderation result. It can inspect the audio and video data in a file at the same time.
The caller submits one or more files for moderation and specifies the detection types. The server returns the detection result through an asynchronous callback.
The supported image detection types and their labels are listed below:
Detection Type | Description | Action | Primary Label | Secondary Label |
---|---|---|---|---|
Porn recognition | Recognize pornographic and sexy content in pictures | v-porn | normal: normal | normal: normal |
Porn recognition | Recognize pornographic and sexy content in pictures | v-porn | sexy: sexy | female_underwear: female underwear; female_sexy_chest_l12: female sexy chest level 12; female_sexy_chest_l3: female sexy chest level 3; female_sexy_chest_l4: female sexy chest level 4; female_backless: female backless; female_sexy_leg: female sexy leg; female_focus_leg: female focus leg; bathing_suit: bathing suit; male_topless: male topless; male_normal_topless: male normal topless; other_sexy: other sexy |
Porn recognition | Recognize pornographic and sexy content in pictures | v-porn | porn: porn | sex_product: sex aids; naked_private_part: exposed sensitive parts; extensive_naked: extensive naked; sex_behavior: sex behavior; naked_female_back: naked female back; naked_hip: naked hip; sex_bulge: sex bulge; focus_female_crotch: focus female crotch; focus_male_crotch: focus male crotch; hand_on_sexy: hand on sexy; lick: lick; kiss: kiss; sm: SM; sperm: sperm; naked_child: naked child; other_dirty: other dirty; tongue_out: tongue out; female_focus_hip: female focus hip; male_underwear: male underwear; porn_pip: porn pip |
Terrorism recognition | Recognize bloody and terrorism-related content in pictures | v-terrorism | normal: normal; fire_explosion: fire explosion; gun: gun; knife: knife; crowd: crowd; flag_of_terrorism: flag of terrorism; special_dress: special dress; disgusted: disgusted; with_weapon: with weapon; bloody: bloody; uniform: uniform | nil |
Sensitive information recognition | Recognize sensitive content in pictures | v-antispam | normal: normal; special_building: special building; rmb: RMB; map_of_China: map of China; cartoons_of_leaders: cartoons of leaders; flags_of_China: flags of China; Tibetan_buddhism: Tibetan Buddhism; other_antispam: other sensitive information; tank: tank; fighter: fighter; cannon: cannon; battleship: battleship | nil |
Sensitive figure recognition | Recognize domestic and overseas politicians and public figures in pictures | v-sface | normal: normal; sface: sensitive figure involved | nil |
Illegal recognition | Identify whether the picture contains illegal scenes | v-illegal | normal: normal; minor: minor; drug: drug; drive: drive; gamble: gamble; smoke: smoke; id_infomation: ID information; tattoo: tattoo; inbed: lying on the bed | nil |
AD recognition | Identify whether the picture contains advertising information | v-ad | normal: normal; QR_code: QR code; bar_code: bar code; applet_code: applet code | nil |
OCR recognition | Identify whether the picture contains text suspected of violations | v-ocr | normal: normal; ocr_politics: politics; ocr_terrorism: terrorism; ocr_porn: porn; ocr_illegal: illegal; ocr_abuse: abuse; ocr_ad: ad | nil |
The supported audio detection types and their labels are listed below:
Detection Type | Description | Action | Label |
---|---|---|---|
Moan recognition in audio | Detect voiceprint features in audio files and recognize illegal features such as moaning | a-porn | normal: normal; moan: moan |
Sensitive word recognition in audio | Transcribe the audio to text and recognize sensitive and illegal content | a-antispam | normal: normal; terrorism: terrorism; porn: porn; illegal: illegal; politics: politically sensitive content; abuse: abuse; ad: advertisement cheating; feudalism: feudalism; religion: religiously sensitive content; affairs: affairs; contraband: contraband; minors: minors; banned-website: banned websites |
Automatic speech recognition | Transcribe the audio to text | a-asr | normal: normal |
Restriction Category | Description |
---|---|
Video format | Supports .avi, .mp4, .asf, .wmv, and .mov. For other formats, contact customer support. |
Limit on file size | A single video must not exceed 200 MB. For larger videos, contact customer support. |
Screenshot interval | One screenshot is captured every 2 seconds |
Concurrency restriction | You can submit up to 20 videos for moderation per second, and the system processes up to 200 videos concurrently. For a higher concurrency capacity, contact customer support. |
Save duration | The system automatically saves suspected illegal screenshots and audio clips and returns their URLs together with the detection result. The files are kept for 3 hours, after which their URLs may become invalid, so export them promptly. |
Video resolution | At least 128 x 128. Very low resolutions may degrade recognition accuracy. |
Area restriction | Available only in the Chinese mainland. For support in other countries and regions, contact customer support. |
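The format, size, and resolution limits above can be checked on the client before a video is submitted. The following is a minimal sketch, not part of the service: the function name `preflight_problems` is hypothetical, and it assumes that 1 MB means 1024 x 1024 bytes and that the caller already knows the file size and resolution.

```python
import os

# Limits copied from the restrictions table above.
SUPPORTED_EXTENSIONS = {".avi", ".mp4", ".asf", ".wmv", ".mov"}
MAX_SIZE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MIN_SIDE_PIXELS = 128


def preflight_problems(filename, size_bytes, width, height):
    """Return a list of problems found; an empty list means no limit is violated."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported format: %s" % (ext or "(none)"))
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file larger than 200 MB")
    if width < MIN_SIDE_PIXELS or height < MIN_SIDE_PIXELS:
        problems.append("resolution below 128 x 128")
    return problems
```

A video that fails one of these checks may still be accepted after contacting customer support, so the result is best treated as a warning rather than a hard rejection.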
Item | Description |
---|---|
Request method | POST |
Request protocol | HTTPS |
Request domain name | ai.jocloud.com |
Request path | /app/{appid}/v1/video/async?traceId=uuid-xxxx-xxxx-xxxx-xxxx |
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request. |
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it) |
Request body | JSON string, defined as follows |
The request parameters are carried in the request body as a JSON object. The fields are described below:
Name | Type | Required | Description |
---|---|---|---|
actions | String array | Yes | Detection types. Video options: v-porn (porn recognition), v-terrorism (terrorism recognition), v-antispam (sensitive information recognition), v-sface (sensitive figure recognition), v-illegal (illegal recognition), v-ad (ad information recognition), v-ocr (OCR information recognition). Audio options: a-porn (moan recognition), a-antispam (sensitive word recognition), a-asr (automatic speech recognition) |
data[] | JSON array | Yes | List of objects to detect. Each element in the JSON array is a detection task structure (see the table "Request Data" below). A single call can process up to 5 videos. |
callback | String | No | Callback address for the result; HTTP and HTTPS are supported. May be empty. When no callback address is given, the detection result can be obtained through the query API (receiving the result through the callback is recommended). |
sequence | String | No | Value used to sign the callback notification request. Required when callback is set. See the description of the detection result callback for details. |
Table: Request Data
Name | Type | Required | Description |
---|---|---|---|
dataId | String | Yes | Unique data ID, for example: uuid-xxxx-xxxx-xxxx-xxxx |
dataType | String | Yes | Data type. The value must be URL. |
content | String | Yes | HTTP address of the video file to be detected |
extra | JSON | No | Extended parameters for audio detection; see the table "Extended parameters" below |
context | JSON | No | Custom context data, returned unchanged with the result |
Table: Extended parameters
Name | Type | Required | Description |
---|---|---|---|
lang | String | No | Language for a-antispam detection; defaults to Chinese. Options: chinese (Chinese), bahasa (Bahasa) |
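The request body described by the tables above can be assembled and validated before it is sent. The sketch below is illustrative only: the helper name `build_request_body` is hypothetical, it reads the limit in the data[] row as at most 5 videos per call, and it generates a fresh UUID for each dataId as in the example value.

```python
import uuid

VIDEO_ACTIONS = {"v-porn", "v-terrorism", "v-antispam", "v-sface",
                 "v-illegal", "v-ad", "v-ocr"}
AUDIO_ACTIONS = {"a-porn", "a-antispam", "a-asr"}
MAX_VIDEOS_PER_CALL = 5  # assumption: the limit of 5 applies per submitted request


def build_request_body(actions, video_urls, callback=None, sequence=None, lang=None):
    """Build the JSON-serializable request body for a moderation task."""
    unknown = set(actions) - VIDEO_ACTIONS - AUDIO_ACTIONS
    if not actions or unknown:
        raise ValueError("missing or unknown actions: %s" % sorted(unknown))
    if not 1 <= len(video_urls) <= MAX_VIDEOS_PER_CALL:
        raise ValueError("between 1 and 5 videos are expected per call")
    if callback and not sequence:
        raise ValueError("sequence is required when callback is set")

    data = []
    for url in video_urls:
        item = {"dataId": str(uuid.uuid4()), "dataType": "URL", "content": url}
        if lang is not None:
            # 'lang' only affects a-antispam detection.
            item["extra"] = {"lang": lang}
        data.append(item)

    body = {"actions": list(actions), "data": data}
    if callback:
        body["callback"] = callback
        body["sequence"] = sequence
    return body
```

The returned dictionary can be passed as the `json` argument of `requests.post`, together with the headers listed in the request table.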
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code, consistent with HTTP status codes and extended where needed; see Error code in request |
message | String | Yes | Error message description |
traceId | String | Yes | The traceId from the request parameters |
requestId | String | Yes | Unique request ID generated by the system for this request, used for the subsequent result callback and status query |
timestamp | Integer | Yes | Current unix timestamp (s) |
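The submission response in the table above carries the requestId that is needed later for the status query. A minimal sketch of reading it follows; the helper name `extract_request_id` is hypothetical, and it assumes that code 200 indicates success, in line with the HTTP status convention mentioned for the code field.

```python
def extract_request_id(response_json):
    """Return requestId from a parsed submission response, or raise on an error code."""
    code = response_json.get("code")
    if code != 200:  # assumption: 200 is the success code
        raise RuntimeError("submission failed: code=%s message=%s traceId=%s" % (
            code, response_json.get("message"), response_json.get("traceId")))
    return response_json["requestId"]
```

Keeping the traceId in the error message makes it easier to report a failed request to support.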
When detection completes, the system sends an HTTP POST request to the callback address provided by the user and returns the detection result in it.
To guard against content tampering, the callback HTTP request carries a checksum item in its header, which the receiver can use to verify the content.
The checksum string is generated as follows:
The sequence value passed in the parameters of the submitted task is concatenated with the body string, and the SHA256 digest of the resulting string is the checksum value.
The detection result is stored in the body as a JSON structure. The fields are described as follows:
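The checksum rule can be verified on the receiving side with the standard library. This is a sketch under stated assumptions rather than the service's reference implementation: it assumes the digest is hex encoded, that sequence is encoded as UTF-8 and placed before the raw body bytes, and that the receiver has already read the checksum value from the request header. The function names are hypothetical.

```python
import hashlib
import hmac


def compute_checksum(sequence, raw_body):
    """SHA256 over sequence + body, hex encoded (the encoding is an assumption)."""
    return hashlib.sha256(sequence.encode("utf-8") + raw_body).hexdigest()


def verify_callback(sequence, raw_body, received_checksum):
    """Return True when the received checksum matches the locally computed one."""
    expected = compute_checksum(sequence, raw_body)
    received = received_checksum.strip().lower()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The digest should be computed over the raw body bytes exactly as received, before any JSON parsing, because re-serializing the JSON can change whitespace and key order.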
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code, consistent with HTTP status codes and extended where needed; see Error code in request |
message | String | Yes | Error message description |
traceId | String | Yes | The traceId from the request parameters, passed through unchanged |
requestId | String | Yes | Unique task ID generated by the system for this detection request |
timestamp | Integer | Yes | Current Unix timestamp (s) |
data[] | JSON array | No | List of detection results (see the table "returned data" below for the structure). Each item in the array is the processing result of one data object. This field may be empty when an error occurs. |
Table: returned data
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code; see Error code in data and action |
message | String | Yes | Error description |
dataId | String | Yes | The dataId from the request |
taskId | String | Yes | Unique task identifier generated for all detection types applied to this detection object |
context | JSON | No | The context from the request |
results[] | JSON array | No | Result data, present when processing succeeds. Each element is a structure that corresponds to one of the actions in the request and holds the processing result of that action. The structure of an element is described in the following table. |
Table: result
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code; see Error code in data and action |
message | String | Yes | Error description |
action | String | Yes | Detection type, matching one of the actions in the request |
label | String | Yes | Detection result label. See the detection types and their labels above. |
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00 – 1.00]. The larger the value, the higher the credibility. |
suggestion | String | Yes | Recommended operation. Options: pass (normal, no operation needed); review (suspected, the detection result is uncertain and needs further manual moderation); block (illegal, punishment is suggested) |
duration | Floating-point number | No | Length of the detected audio, returned for audio detection types |
text | String | No | Text content of audio clips, provided only when action is "a-antispam" or "a-asr". |
segment[] | JSON array | No | List of identification results for the video frames or audio segments marked 'review' or 'block'. The segment fields differ by action; see the definition of each action's segment below for details. |
The structure of result->segment differs by action, as follows:
(1) When action is v-porn, v-terrorism, v-antispam, v-sface, v-illegal, v-ad, or v-ocr, the structure of result->segment is as follows:
Name | Type | Required | Description |
---|---|---|---|
label | String | Yes | Detection label, the one with the largest rate in all face labels |
rate | Floating-point number | Yes | Probability of detection result label, with the value ranging between [0.00 – 1.00]. The larger the value, the higher the credibility. |
suggestion | String | Yes | Suggested operation |
url | String | Yes | Screenshot address |
timeOffset | Floating-point number | Yes | Time offset of the screenshot from the start of the video |
extraData[] | JSON array | No | Present only when action is 'v-porn' or 'v-sface'. For 'v-porn', the array holds the identified secondary label information; the element structure is shown in the table "porn" below. For 'v-sface', the array holds the face information of all people recognized in the screenshot; the element structure is shown in the table "face" below. |
Table: face
Name | Type | Required | Description |
---|---|---|---|
label | String | Yes | Detected face label |
rate | Floating-point number | Yes | Probability of detection result label, with the value ranging between [0.00 – 1.00]. The larger the value, the higher the credibility. |
name | String | Yes | Detected name of sensitive figure |
x | Integer | Yes | X-coordinate of the upper left corner of the detected face in the picture |
y | Integer | Yes | Y-coordinate of the upper left corner of the detected face in the picture |
w | Integer | Yes | Width of detected face |
h | Integer | Yes | Height of detected face |
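The x, y, w, and h fields of a face element describe a rectangle anchored at its upper left corner. As a small illustration, the hypothetical helper below converts one element into the (left, top, right, bottom) form that many image libraries expect for cropping; it assumes the coordinates are in pixels of the screenshot.

```python
def face_box_corners(face):
    """Convert a face element (x, y, w, h) to (left, top, right, bottom)."""
    left, top = face["x"], face["y"]
    return left, top, left + face["w"], top + face["h"]
```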
Table: porn
Name | Type | Required | Description |
---|---|---|---|
label | String | Yes | Secondary label |
rate | Floating-point number | Yes | Probability of detection result label, with the value ranging between [0.00 – 1.00]. The larger the value, the higher the credibility. |
(2) When action is a-porn, the structure of result->segment is as follows:
Name | Type | Required | Description |
---|---|---|---|
begin | Floating-point number | No | Start time of audio clips (s) |
end | Floating-point number | No | End time of audio clips (s) |
score | Floating-point number | No | Degree of match with moaning, in the range [0–100]. The higher the score, the closer the match. |
(3) When action is a-antispam, the structure of result->segment is as follows:
Name | Type | Required | Description |
---|---|---|---|
begin | Floating-point number | No | Start time of audio clips (s) |
end | Floating-point number | No | End time of audio clips (s) |
extraData[] | JSON array | No | List of sensitive word recognition results for this audio segment; the element structure is shown in the table "antispam-extraData" below |
Table: antispam-extraData
Name | Type | Required | Description |
---|---|---|---|
hint | JSON array | No | Keywords that were hit |
label | String | No | Type of the hit keywords |
rate | Floating-point number | No | Not meaningful; always "1.0" |
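Because the result structure is nested (data, then results, then segment), a receiver usually flattens it before acting on it. The sketch below collects every segment that belongs to a 'review' or 'block' result. The function name and the output layout are hypothetical, and it assumes the JSON field names are exactly those in the tables, without the [] suffix.

```python
def collect_flagged_segments(callback_body):
    """Flatten data -> results -> segment into a list of flagged items."""
    flagged = []
    for item in callback_body.get("data") or []:
        for result in item.get("results") or []:
            if result.get("suggestion") not in ("review", "block"):
                continue
            for seg in result.get("segment") or []:
                flagged.append({
                    "dataId": item.get("dataId"),
                    "action": result.get("action"),
                    # Audio segments carry no label of their own, so fall back
                    # to the label and suggestion of the enclosing result.
                    "label": seg.get("label", result.get("label")),
                    "suggestion": seg.get("suggestion", result.get("suggestion")),
                    # Video segments have a screenshot URL, audio ones a time range.
                    "where": seg.get("url") or (seg.get("begin"), seg.get("end")),
                })
    return flagged
```

Each returned item identifies the video by dataId and points either at a screenshot URL or at a (begin, end) time range, which is enough to queue it for manual review.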
Because processing is asynchronous, receiving the result through the asynchronous callback described above is recommended. If necessary, the result can also be obtained by polling the query API, which is described below:
Item | Description |
---|---|
Request method | GET |
Request protocol | HTTPS |
Request domain name | ai.jocloud.com |
Request path | /app/{appid}/v1/video/async/results?traceId=uuid-xxxx-xxxx-xxxx-xxxx&requestId=yyyy |
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request. requestId is the ID of the request to query, that is, the requestId returned when the detection task was submitted. |
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it) |
The response body is JSON. The fields are described as follows:
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code; see Error code in request |
message | String | Yes | Error message description |
traceId | String | Yes | The traceId from the request parameters, passed through unchanged |
requestId | String | Yes | The requestId being queried, consistent with the request parameter |
status | String | Yes | Task status: received (pending), processing (in progress), or completed (done) |
timestamp | Integer | Yes | Current Unix timestamp (s) |
data[] | Array | No | Result data. When the call succeeds (code==200), see the table "returned data" above for the element definition. |
The following sample code calls the API with Python:
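When a callback cannot be used, the status field can drive a simple polling loop. The sketch below is one possible shape, not a prescribed client: `poll_until_completed` and its parameters are hypothetical, the polling interval and attempt limit are arbitrary, and it assumes that code stays 200 while the task is still in the received or processing state. The HTTP call itself is supplied by the caller as `fetch_status`, a function that returns the parsed JSON response.

```python
import time


def poll_until_completed(fetch_status, interval_s=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until status is 'completed', then return the data list."""
    for attempt in range(max_attempts):
        reply = fetch_status()
        if reply.get("code") != 200:  # assumption: 200 also covers pending states
            raise RuntimeError("query failed: code=%s message=%s" % (
                reply.get("code"), reply.get("message")))
        if reply.get("status") == "completed":
            return reply.get("data") or []
        if attempt + 1 < max_attempts:
            sleep(interval_s)
    raise TimeoutError("task not completed after %d attempts" % max_attempts)
```

With the requests library, `fetch_status` could be a small function that issues the GET request to the results path with the token header and returns `response.json()`. Results are kept for a limited time, so polling should start soon after submission.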
# -*- coding: utf-8 -*-
#! python3.5
import requests
import uuid
import base64
host = "https://ai.jocloud.com"
appid = 123456789 # Your Service id
restful_id = '********************' # Your certificate ID
restful_secret = '********************' # Your certificate key
traceid = str(uuid.uuid4())
dataid = str(uuid.uuid4())
# url
url = host + '/app/%s/v1/video/async/submit?traceId=%s' % (appid, traceid)
# headers
headers = {
    "content-type": "application/json"
}
auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()
# The URL of the video file to be identified
file_url = 'http://newcntv.qcloudcdn.com/asp/hls/1200/0303000a/3/default/d67f1b655f0b49be87fdcc84f7f06029/7.ts'
# Context information that the service returns in the callback message to assist subsequent handling, for example:
context = {
    'myid': 123,
    'myname': 'test'
}
# Callback address for the identification result and status; both notifications are delivered through HTTP POST
callback_addr = 'http://mydomain.com/callback'
# content
values = {
    'actions': ['v-sface', 'v-porn'],
    'data': [
        {
            'dataId': dataid,
            'dataType': 'URL',
            'content': file_url,
            'extra': {},
            'context': context
        }
    ],
    'callback': callback_addr,
    'sequence': 'test'
}
# request
res = requests.post(url, json=values, headers=headers)
print('url=%s\nbody=%s\ncode=%s\ndata=%s\n' % (url, values, res.status_code, res.text))
Version | Time | Description |
---|---|---|
v2.2.3 | 2020-10-15 | Add a new label in OCR recognition |
v2.2.2 | 2020-10-14 | Add 'a-asr' action support |
v2.2.1 | 2020-10-13 | Add 'inbed' label in illegal recognition |
v2.2.0 | 2020-08-31 | Add ad and OCR recognition; update labels of 'v-terrorism' and 'v-antispam' recognition |
v2.1.0 | 2020-08-24 | Add illegal recognition |
v2.0.0 | 2020-08-17 | Add secondary labels in porn recognition |
v1.1.0 | 2020-06-30 | Remove 'interval' and 'maxframes' from the 'extra' parameters of the start request; the screenshot interval is fixed at 2 seconds |
v1.0.0 | 2020-03-13 | Initial version |