Asynchronous Audio Moderation

Audio moderation identifies malicious content in audio and returns a moderation result together with a handling suggestion.

1. Introduction

The caller submits an audio clip for moderation and specifies the detection type. The server processes the data and returns the detection result through a callback.
Compared with the synchronous moderation API, the asynchronous API accepts more data per request and therefore takes longer to process.
The currently available detection types are moan recognition, sensitive word recognition, and automatic speech recognition.

Detection types and their labels are listed below:

Detection Type | Description | Action | Label
--- | --- | --- | ---
Moan recognition | Detects voiceprint features in audio files and recognizes illegal features such as moaning | porn | normal: normal; moan: moan
Sensitive word recognition | Transcribes the audio to text and recognizes sensitive and illegal content | antispam | normal: normal; terrorism: terrorism; porn: porn; illegal: illegal; politics: politically sensitive content; abuse: abuse; ad: advertisement cheating; feudalism: feudalism; religion: religiously sensitive content; affairs: affairs; contraband: contraband; minors: minors; banned-website: banned websites
Automatic speech recognition | Transcribes the audio to text | asr | normal: normal

2. Restrictions

Restriction Category | Description
--- | ---
File source | Audio URL starting with HTTP/HTTPS
File duration | The maximum duration is 5 minutes by default. For longer audio files, contact customer support.
File size | No more than 50 MB for a single audio file
File format | aac, mp3, m4a, and mp4 are supported. For more formats, contact customer support.
Language support | Sensitive word recognition supports Chinese and Bahasa Indonesia. For more languages, contact customer support.
Concurrency restriction | You can submit up to 20 audio clips for moderation per second, and the system processes up to 200 audio clips concurrently. For higher concurrency, contact customer support.
Area restriction | Available only in the Chinese mainland. For support in other countries and regions, contact customer support.
Result cache | After the server completes processing, the result is cached for 2 hours so that the caller can query it, and is then removed automatically.
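
Some of these restrictions can be checked on the client before a clip is submitted. The following is a minimal sketch, not part of the API: the function name check_audio_url is hypothetical, the format is only guessed from the URL path extension (URLs without an extension are not checked), and "50MB" is read as 50 * 1024 * 1024 bytes, which is an assumption.

```python
import posixpath
from urllib.parse import urlparse

# Default limits taken from the restrictions table above.
ALLOWED_EXTENSIONS = {".aac", ".mp3", ".m4a", ".mp4"}
MAX_SIZE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" interpreted as 50 MiB


def check_audio_url(url, size_bytes=None):
    """Return a list of problems found for one audio URL (empty if none).

    Hypothetical helper: the server remains the authority on what it accepts.
    """
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must start with http:// or https://")
    # Guess the format from the path extension; skip the check if there is none.
    extension = posixpath.splitext(parsed.path)[1].lower()
    if extension and extension not in ALLOWED_EXTENSIONS:
        problems.append("extension %s is not in the default format list" % extension)
    if size_bytes is not None and size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 50 MB")
    return problems


print(check_audio_url("https://example.com/clip.wav"))
```

Duration cannot be checked from the URL alone, so the 5-minute limit is left to the server in this sketch.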

3. API

3.1 Submit for Detection

Send a moderation request using HTTP POST.

Item | Description
--- | ---
Request method | POST
Request protocol | HTTPS
Request domain name | ai.jocloud.com
Request path | app/{appid}/v1/audio/async/submit?traceId=uuid-xxxx-xxxx-xxxx-xxxx
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request.
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it)
Request body | JSON string, defined as follows

Table: body data structure

Name | Type | Required | Description
--- | --- | --- | ---
actions | String array | Yes | Detection types. Options: porn (moan recognition); antispam (sensitive word recognition); asr (automatic speech recognition)
data[] | JSON array | Yes | List of objects to detect. Each element is an audio detection object (see the request data table below). A single request can process up to 5 audio clips.
callback | String | No | Result callback URL; HTTP and HTTPS are supported. May be null, in which case you obtain the detection result through the search API.
sequence | String | No | Value used for the signature in the callback notification request. Required when callback is set. See "Call Back the Result" for how it is used.

Table: request data

Name | Type | Required | Description
--- | --- | --- | ---
dataId | String | Yes | Unique identifier of the object, for example: uuid-xxxx-xxxx-xxxx-xxxx
dataType | String | Yes | Data type. Options: URL (URL starting with HTTP/HTTPS)
content | String | Yes | URL of the audio clip to be detected
context | JSON | No | Custom context data, returned unchanged with the result
extra | JSON | No | Extra configuration. See the extra table below.

Table: extra

Name | Type | Description
--- | --- | ---
lang | String | Language of the audio clip. Options: chinese (Chinese); bahasa (Bahasa Indonesia)
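
The three tables above can be combined into one request body. The sketch below is illustrative only: build_submit_body is a hypothetical helper, not part of the API, and the checks it performs (known actions, at most 5 clips, sequence present whenever callback is set) simply mirror the table descriptions.

```python
import uuid

# Values taken from the body data structure table above.
VALID_ACTIONS = {"porn", "antispam", "asr"}
MAX_CLIPS_PER_REQUEST = 5


def build_submit_body(urls, actions, callback=None, sequence=None, lang=None):
    """Build the JSON body for a submit request (hypothetical helper)."""
    if not actions or not set(actions) <= VALID_ACTIONS:
        raise ValueError("actions must be a non-empty subset of %s" % sorted(VALID_ACTIONS))
    if not 1 <= len(urls) <= MAX_CLIPS_PER_REQUEST:
        raise ValueError("a single request can process 1 to 5 audio clips")
    if callback and not sequence:
        raise ValueError("sequence is required when callback is set")

    data = []
    for url in urls:
        item = {"dataId": str(uuid.uuid4()), "dataType": "URL", "content": url}
        if lang:
            item["extra"] = {"lang": lang}  # "chinese" or "bahasa"
        data.append(item)

    body = {"actions": list(actions), "data": data}
    if callback:
        body["callback"] = callback
        body["sequence"] = sequence
    return body


body = build_submit_body(["https://example.com/clip.mp3"], ["antispam"], lang="bahasa")
print(body["data"][0]["extra"])
```

Omitting callback leaves both callback and sequence out of the body, in which case the result has to be fetched through the search API.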

Response

The response content is a JSON object, as defined below.

Name | Type | Required | Description
--- | --- | --- | ---
code | Integer | Yes | Error code. See the description of error codes below.
message | String | Yes | Error message description
traceId | String | Yes | The traceId passed in the request parameter
requestId | String | Yes | Unique ID generated by the system for this request, used for the subsequent result callback and status search
timestamp | Integer | Yes | Current Unix timestamp (s)
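
The requestId in this response is what the search API later needs, so it is worth extracting and storing as soon as the submit call returns. A minimal sketch, assuming that any 2xx value of code means success (the function name extract_request_id is hypothetical):

```python
import json


def extract_request_id(response_text):
    """Parse a submit response and return its requestId (hypothetical helper).

    Assumption: a 2xx value in "code" means the request was accepted.
    """
    payload = json.loads(response_text)
    code = payload.get("code", 0)
    if not 200 <= code < 300:
        raise RuntimeError("submit failed: code=%s message=%s" % (code, payload.get("message")))
    return payload["requestId"]


sample = '{"code": 200, "message": "OK", "traceId": "t", "requestId": "r-1", "timestamp": 1584088487}'
print(extract_request_id(sample))
```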

3.2 Call Back the Result

Upon completion of processing, the server returns the result to the caller through the callback address supplied in the submit request.
Callback method: HTTPS POST
Callback path: the callback address supplied when submitting for detection
Callback header: a checksum is added to the header so that the caller can verify the content and detect tampering.

The checksum string is generated as follows:

The "sequence" parameter supplied when submitting for detection and the result "body" are combined into a string, and the checksum is the SHA256 hash of that string.
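
A receiver might verify the checksum along the following lines. This is only a sketch under stated assumptions, none of which this page confirms: the sequence value is placed before the raw callback body, the digest is hex-encoded, and the checksum has already been read from the callback header (the header name is not given here). The function names are hypothetical.

```python
import hashlib
import hmac


def compute_checksum(sequence, raw_body):
    """SHA256 over sequence + body.

    Assumptions: sequence comes first, raw_body is the unmodified callback
    body as bytes, and the result is a lowercase hex digest.
    """
    return hashlib.sha256(sequence.encode("utf-8") + raw_body).hexdigest()


def verify_callback(sequence, raw_body, received_checksum):
    """Return True if the received checksum matches the computed one."""
    expected = compute_checksum(sequence, raw_body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"),
                               received_checksum.strip().lower().encode("utf-8"))


body = b'{"code": 200, "message": "OK"}'
checksum = compute_checksum("test", body)
print(verify_callback("test", body, checksum))
```

Hash the body bytes exactly as received; re-serializing the parsed JSON can change whitespace or key order and produce a different digest.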

Callback content: the HTTP body is a JSON object, defined as follows.

Name | Type | Required | Description
--- | --- | --- | ---
code | Integer | Yes | Error code. See the description of error codes below.
message | String | Yes | Error message description
traceId | String | Yes | The traceId passed through from the request parameter
requestId | String | Yes | Unique task identifier generated by the system for this detection request
timestamp | Integer | Yes | Current Unix timestamp (s)
data[] | JSON array | No | List of detection results (see Table: data below for the structure). Each item is the processing result for one submitted object. This field may be empty if an error occurred.

Table: data
Name | Type | Required | Description
--- | --- | --- | ---
code | Integer | Yes | Error code. See the description of error codes (error codes in data and action).
message | String | Yes | Description of the error
dataId | String | Yes | The dataId from the request
taskId | String | Yes | Unique task identifier covering all detection types requested for this object
context | JSON | No | The context from the request
results[] | JSON array | No | Result data. When processing succeeds (code==200), it contains one or more elements, each being the result of one action. See Table: result below for the structure.

Table: result
Name | Type | Required | Description
--- | --- | --- | ---
code | Integer | Yes | Error code. See the description of error codes (error codes in data and action).
message | String | Yes | Description of the error
action | String | Yes | Detection type, matching one of the actions in the request
label | String | Yes | Detection result label. Its value depends on action; see the table of detection types and labels in the Introduction.
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the higher the confidence.
suggestion | String | Yes | Recommended operation. Options: pass (normal, no operation required); block (illegal, penalizing the content is suggested); review (suspected, the result is uncertain and needs manual moderation)
duration | Floating-point number | Yes | Playback duration of the audio
text | String | No | Transcribed text
segment[] | JSON array | No | List of audio segments identified as 'review' or 'block'. The segment fields differ by action; see the segment tables below.

Table: porn-segment
Name | Type | Required | Description
--- | --- | --- | ---
begin | Floating-point number | No | Start time of the audio segment (s)
end | Floating-point number | No | End time of the audio segment (s)
score | Floating-point number | No | Degree to which the segment matches moaning, in the range [0, 100]. The higher the score, the closer the match.

Table: antispam-segment
Name | Type | Required | Description
--- | --- | --- | ---
begin | Floating-point number | No | Start time of the audio segment (s)
end | Floating-point number | No | End time of the audio segment (s)
extraData[] | JSON array | No | Sensitive word recognition results for this audio segment. See Table: antispam-extraData below for the element structure.

Table: antispam-extraData

Name | Type | Required | Description
--- | --- | --- | ---
hint | JSON array | No | Keywords that were hit
label | String | No | Type of the hit keywords
rate | Floating-point number | No | Not meaningful; always 1.0
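
Because results are nested three levels deep (data, results, segment), a small flattening step often makes a callback payload easier to act on. The sketch below is illustrative: flagged_segments is a hypothetical helper that collects every segment whose parent result carries a 'review' or 'block' suggestion, and it treats all optional arrays as possibly missing.

```python
def flagged_segments(payload):
    """Flatten a callback or search payload into one row per flagged segment.

    Hypothetical helper; field names follow the data, result, and segment
    tables above.
    """
    rows = []
    for item in payload.get("data") or []:
        for result in item.get("results") or []:
            if result.get("suggestion") not in ("review", "block"):
                continue
            for segment in result.get("segment") or []:
                rows.append({
                    "dataId": item.get("dataId"),
                    "action": result.get("action"),
                    "label": result.get("label"),
                    "suggestion": result.get("suggestion"),
                    "begin": segment.get("begin"),
                    "end": segment.get("end"),
                })
    return rows


payload = {
    "code": 200,
    "data": [{
        "dataId": "clip-1",
        "results": [{
            "action": "porn", "label": "moan", "suggestion": "review",
            "segment": [{"begin": 0, "end": 10, "score": 74}],
        }],
    }],
}
for row in flagged_segments(payload):
    print(row)
```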

3.3 Search Result

Upon completion of processing, the server delivers the result through the callback, which is the recommended way to receive it. If necessary, the caller can also query the processing status and result through the following API.

Item | Description
--- | ---
Request method | GET
Request protocol | HTTPS
Request domain name | ai.jocloud.com
Request path | app/{appid}/v1/audio/async/results?traceId=uuid-xxxx-xxxx-xxxx-xxxx&requestId=yyyy
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request. requestId is the ID of the request to look up, i.e. the requestId returned in the response to the submit request.
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it)

Search Response

The response content is a JSON object, as defined below.

Name | Type | Required | Description
--- | --- | --- | ---
code | Integer | Yes | Error code, consistent with HTTP status codes and subject to extension: 2xx indicates success; 4xx indicates a request error; 500 indicates a server error. For specific values, see the related descriptions.
message | String | Yes | Error message description
traceId | String | Yes | The traceId passed through from the request parameter
requestId | String | Yes | The requestId being searched, consistent with the request parameter
status | String | Yes | Task status: received (pending), processing (in progress), completed (done)
timestamp | Integer | Yes | Current Unix timestamp (s)
data[] | Array | No | Result data, returned when the call succeeds (code==200). See Table: data above for the element definition.
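
When the callback cannot be used, the status field lends itself to a simple polling loop. The sketch below is an illustration, not a recommended client: wait_for_result is a hypothetical helper, fetch_status stands for any function that performs the GET request and returns the parsed JSON, and the interval and timeout defaults are arbitrary assumptions. Results are cached for only 2 hours after processing, so polling should not be deferred for long.

```python
import time


def wait_for_result(fetch_status, interval_s=2.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the task status is "completed" and return that payload.

    fetch_status: callable returning the parsed search response (a dict).
    Raises RuntimeError on a non-2xx code and TimeoutError on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status()
        code = payload.get("code", 0)
        if not 200 <= code < 300:
            raise RuntimeError("search failed: code=%s message=%s" % (code, payload.get("message")))
        if payload.get("status") == "completed":
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not complete within %s seconds" % timeout_s)
        sleep(interval_s)  # status is still "received" or "processing"


# Simulated responses stand in for real GET requests in this example.
responses = iter([
    {"code": 200, "status": "received"},
    {"code": 200, "status": "processing"},
    {"code": 200, "status": "completed", "data": []},
])
result = wait_for_result(lambda: next(responses), interval_s=0)
print(result["status"])
```

In real use, fetch_status might wrap the requests.get call from the sample code in section 4.2 and return res.json().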

4. Sample Code

4.1 Initiate a Detection Request

# -*- coding: utf-8 -*-
# Requires Python 3.5 or later

import requests
import uuid
import base64

host = "https://ai.jocloud.com"

appid = 123456789                         # Your service ID
restful_id = '********************'       # Your certificate ID
restful_secret = '********************'   # Your certificate key
traceid = str(uuid.uuid4())
dataid = str(uuid.uuid4())

# url
url = host + '/app/%s/v1/audio/async/submit?traceId=%s' % (appid, traceid)
callback = 'http://127.0.0.1/check?dataid=%s' % dataid

# headers
headers = {
    "content-type": "application/json"
}

auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()

# content
values = {
    'actions': ['porn'],
    'data': [
        {
            'dataType': 'URL',
            'content': 'http://static.s3.huajiao.com/Object.access/hj-video/NjJhYWU0OTZmOTY5ZDZkM2UxZDBjZDE0MWNkMjljMDcubXAz',
            'dataId': dataid,
            'context': {'uid': 12345}
        }
    ],
    'callback': callback,
    'sequence': 'test'
}

# request
res = requests.post(url, json=values, headers=headers)
print('code=%s, data=%s\n' % (res.status_code, res.text))

Response to the request

{
  "code": 200,
  "message": "OK",
  "traceId": "ff1ec05d-46e7-4235-9c46-cd44b684f043",
  "requestId": "d9f3171b-bd1f-48b1-b663-b232e4ed156b",
  "timestamp": 1584088487
}

4.2 Search the Result

Initiate a search

# -*- coding: utf-8 -*-
# Requires Python 3.5 or later

import requests
import uuid
import base64

host = "https://ai.jocloud.com"

appid = 123456789                         # Your service ID
restful_id = '********************'       # Your certificate ID
restful_secret = '********************'   # Your certificate key
requestId = 'd9f3171b-bd1f-48b1-b663-b232e4ed156b'  # The requestId returned in the response to the submit request
traceid = str(uuid.uuid4())

# url
url = host + '/app/%s/v1/audio/async/results?traceId=%s&requestId=%s' % (appid, traceid, requestId)

# headers
headers = {
    "content-type": "application/json"
}
auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()

# request
res = requests.get(url, headers=headers)
print('code=%s, data=%s\n' % (res.status_code, res.text))

Response to the search

{
  "code": 200,
  "data": [
    {
      "code": 200,
      "message": "OK",
      "context": { "uid": 12345 },
      "dataId": "69c5b065-d0bf-47c5-84cc-252f6dbf6979",
      "results": [
        {
          "code": 200,
          "message": "OK",
          "action": "porn",
          "label": "moan",
          "rate": 0.7400000095367432,
          "suggestion": "review",
          "duration": 10,
          "segment": [
            {
              "begin": 0,
              "end": 10,
              "score": 74
            }
          ]
        }
      ],
      "taskId": "36c1f70f-89ae-47c3-ba97-40b0a82af7d7"
    }
  ],
  "message": "OK",
  "requestId": "d9f3171b-bd1f-48b1-b663-b232e4ed156b",
  "status": "completed",
  "timestamp": 1584088487,
  "traceId": "f257b637-692c-4033-85aa-ad2ebbc50a89"
}

5. Update History

Version | Time | Description
--- | --- | ---
V1.1.0 | 2020-10-14 | Add 'asr' action support
V1.0.1 | 2020-07-24 | Add Bahasa Indonesia language support
V1.0.0 | 2020-03-13 | Initial version
