Text Moderation

Text moderation detects malicious content in text and returns a moderation result together with a handling suggestion.

1. Introduction

The caller submits text for moderation and specifies the detection type. The server returns the result synchronously.
The only detection type currently available is sensitive word recognition.

Detection types and their labels are listed below:

| Detection Type | Description | Action | Label |
| --- | --- | --- | --- |
| Sensitive word recognition | Recognize sensitive and illegal content in text files | antispam | normal: normal |
| | | | terrorism: terrorism |
| | | | porn: porn |
| | | | illegal: illegal |
| | | | politics: politically sensitive content |
| | | | abuse: abuse |
| | | | ad: advertisement cheating |
| | | | feudalism: feudalism |
| | | | religion: religiously sensitive content |
| | | | affairs: affairs |
| | | | contraband: contraband |
| | | | minors: minors |
| | | | banned-website: banned websites |

2. Restrictions

| Category | Description |
| --- | --- |
| Text content | Base64-encoded string of UTF-8 text |
| Text size | Up to 1,000 characters per text |
| Character format | Text submitted for moderation must be UTF-8 |
| Language support | Chinese, English, and Arabic. For other languages, contact customer support. |
| Concurrency restriction | Up to 20 texts per second (20 QPS). For higher QPS, contact customer support. |
| Area restriction | Chinese mainland only. For support in other countries and regions, contact business support. |
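
The size and encoding limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK. The helper name `encode_text` and the constant `MAX_TEXT_CHARS` are hypothetical, and the sketch assumes that the 1,000-character limit counts characters of the original text rather than bytes of the Base64 string.

```python
import base64

# Per-text limit from the restrictions table (assumed to count characters, not bytes).
MAX_TEXT_CHARS = 1000


def encode_text(text):
    """Return the Base64 encoding of the UTF-8 bytes of `text`.

    Raises ValueError if the text exceeds the documented length limit.
    """
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            "text has %d characters, limit is %d" % (len(text), MAX_TEXT_CHARS)
        )
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


print(encode_text("hello"))  # aGVsbG8=
```

Rejecting oversized text locally avoids spending a request on input the service is documented not to accept.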

3. API Description

3.1 Initiate a Request

Send a moderation request using HTTP POST

| Item | Description |
| --- | --- |
| Request method | POST |
| Request protocol | HTTPS |
| Request domain name | ai.jocloud.com |
| Request path | /app/{appid}/v1/text/sync?traceId=uuid-xxxx-xxxx-xxxx-xxxx |
| Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Using a different value for each request is recommended. |
| Request header | Content-Type: application/json;charset=UTF-8 |
| | token: authentication token. For how to generate it, see Identity Authentication. |
| Request body | JSON string, defined as follows |
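
As an illustration of the request line and headers described above, the sketch below assembles the URL with a fresh traceId and the two documented headers. The function names `build_url` and `build_headers` are hypothetical. Token generation is covered in Identity Authentication and is not reproduced here, so the token is passed in as an opaque string.

```python
import uuid

HOST = "https://ai.jocloud.com"


def build_url(appid, trace_id=None):
    """Build the synchronous text moderation URL.

    A new UUID is generated when no traceId is supplied, so that each
    request carries a different value.
    """
    if trace_id is None:
        trace_id = str(uuid.uuid4())
    return "%s/app/%s/v1/text/sync?traceId=%s" % (HOST, appid, trace_id)


def build_headers(token):
    """Return the documented request headers; `token` is produced elsewhere."""
    return {
        "Content-Type": "application/json;charset=UTF-8",
        "token": token,
    }


print(build_url(123456789, "uuid-xxxx-xxxx-xxxx-xxxx"))
```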

Table: body data structure

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| actions | String array | Yes | Detection types. Options: antispam |
| data[] | JSON array | Yes | List of objects to detect. Each element of the array describes one text (see Table "request data" below). A single request can process up to 10 texts, each containing no more than 1,000 characters. |
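
Because a single request accepts at most 10 texts, a larger workload has to be split across several request bodies. The following sketch shows one way to do that. The name `build_bodies` is hypothetical, and the elements of `items` are assumed to already follow the structure in Table "request data".

```python
MAX_TEXTS_PER_REQUEST = 10  # documented per-request limit


def build_bodies(items, actions=("antispam",)):
    """Split `items` into request bodies of at most 10 data elements each."""
    bodies = []
    for start in range(0, len(items), MAX_TEXTS_PER_REQUEST):
        bodies.append({
            "actions": list(actions),
            "data": items[start:start + MAX_TEXTS_PER_REQUEST],
        })
    return bodies


# 23 placeholder items are split into bodies of 10, 10, and 3 elements.
placeholder_items = [{"dataId": str(i)} for i in range(23)]
print([len(body["data"]) for body in build_bodies(placeholder_items)])
```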

Table: request data

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| dataId | String | Yes | Unique object ID, for example: uuid-xxxx-xxxx-xxxx-xxxx |
| dataType | String | Yes | Data type. Options: URL (a URL starting with HTTP/HTTPS); BASE64 (a Base64-encoded string of UTF-8 text) |
| content | String | Yes | Text content to be detected. If dataType is URL, enter the URL of the text. If dataType is BASE64, enter the Base64-encoded string of the text. |
| extra | JSON | No | Additional configuration, defined in Table "extra" below |
| context | JSON | No | Custom context data, returned unchanged with the result |

Table: extra

| Name | Type | Description |
| --- | --- | --- |
| lang | String | Language for text detection, Chinese by default. Options: chinese (Chinese); english (English); arabic (Arabic) |
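
Taken together, the request data and extra tables describe one element of the data[] array. The sketch below builds such an element for either data type. The function name `make_item` is hypothetical, and generating dataId with `uuid.uuid4()` is an assumption borrowed from the example value in the table, not a documented requirement.

```python
import base64
import uuid

SUPPORTED_LANGS = ("chinese", "english", "arabic")  # options in Table "extra"


def make_item(content, data_type="BASE64", lang="chinese", context=None):
    """Build one element of data[] for inline text (BASE64) or a text URL (URL)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError("unsupported lang: %r" % (lang,))
    if data_type == "BASE64":
        # Inline text is sent as the Base64 encoding of its UTF-8 bytes.
        payload = base64.b64encode(content.encode("utf-8")).decode("ascii")
    elif data_type == "URL":
        if not content.lower().startswith(("http://", "https://")):
            raise ValueError("URL must start with HTTP/HTTPS")
        payload = content
    else:
        raise ValueError("unsupported dataType: %r" % (data_type,))
    item = {
        "dataId": str(uuid.uuid4()),
        "dataType": data_type,
        "content": payload,
        "extra": {"lang": lang},
    }
    if context is not None:
        item["context"] = context  # echoed back with the result
    return item


print(make_item("hello", lang="english", context={"uid": 12345}))
```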

3.2 Response Result

The response is a JSON object, defined as follows:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | Yes | Error code. See the error code description. |
| message | String | Yes | Error message |
| traceId | String | Yes | The traceId passed in the request parameter, returned unchanged |
| requestId | String | Yes | Unique ID generated by the system for this detection request |
| timestamp | Integer | Yes | Current Unix timestamp, in seconds |
| data[] | JSON array | No | List of detection results (for the structure, see Table "returned data" below). Each item in the array is the processing result of one text. This field may be empty if an error occurs. |
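
A caller would typically check the top-level code before looking at data[]. The following is a minimal sketch of that check. The function name `parse_response` is hypothetical, and raising `RuntimeError` on a non-200 code is a design choice of this sketch rather than documented behavior.

```python
import json


def parse_response(raw):
    """Parse the response body and return the list of per-text results.

    Raises RuntimeError when the top-level code is not 200; the traceId is
    included in the message to help with troubleshooting.
    """
    payload = json.loads(raw)
    if payload["code"] != 200:
        raise RuntimeError(
            "moderation request failed: code=%s message=%s traceId=%s"
            % (payload["code"], payload["message"], payload.get("traceId"))
        )
    # data[] is optional and may be empty, so fall back to an empty list.
    return payload.get("data") or []


sample = '{"code": 200, "message": "OK", "traceId": "t", "requestId": "r", "timestamp": 1584071473, "data": []}'
print(parse_response(sample))
```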

Table: returned data

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | Yes | Error code. See the error code description. |
| message | String | Yes | Error description |
| dataId | String | Yes | The dataId from the request |
| taskId | String | Yes | Unique task identifier generated for this detection object, covering all of its detection types |
| context | JSON | No | The context from the request |
| results[] | Array | No | Result data. When the call succeeds (code == 200), it contains one or more elements, each with the structure shown in Table "result" below. |
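
Each element of data[] carries its own code, so one text can fail while others in the same request succeed. The sketch below separates the two cases and keys them by dataId so they can be matched to the submitted texts. The function name `split_items` is hypothetical.

```python
def split_items(items):
    """Split returned data elements into successes and failures, keyed by dataId.

    Successes map dataId to the results list; failures map dataId to a
    (code, message) tuple.
    """
    succeeded, failed = {}, {}
    for item in items:
        if item["code"] == 200:
            succeeded[item["dataId"]] = item.get("results") or []
        else:
            failed[item["dataId"]] = (item["code"], item["message"])
    return succeeded, failed


# Illustrative input only; the IDs and messages are made up for this sketch.
items = [
    {"code": 200, "message": "OK", "dataId": "a", "results": [{"suggestion": "pass"}]},
    {"code": 402, "message": "Download failed", "dataId": "b"},
]
print(split_items(items))
```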

Table: result

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| action | String | Yes | Detection type, matching one of the detection types (actions) in the request |
| label | String | Yes | Detection result label; its value depends on action. For the possible values, see the table of detection types and labels in the Introduction. |
| rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the more likely the text falls into this category. |
| suggestion | String | Yes | Suggested operation. Options: pass (normal, no action required); review (suspected, further manual review required); block (illegal, penalty suggested) |
| extraData[] | JSON array | No | Extension data, including the hit keywords and extension information (e.g. keyword types). See Table "antispam-extraData". |
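
Since results[] may hold more than one element, a caller needs a rule for combining their suggestions. One possible rule, shown here as a sketch, is to act on the strictest one. The ordering pass < review < block and the name `strictest_suggestion` are assumptions of this example, not something the API defines.

```python
# Higher number means stricter handling; this ordering is an assumption of the sketch.
SEVERITY = {"pass": 0, "review": 1, "block": 2}


def strictest_suggestion(results):
    """Return the strictest suggestion among the result elements.

    Returns None for an empty list. An unknown suggestion value raises
    KeyError so that it is not silently ignored.
    """
    suggestions = [result["suggestion"] for result in results]
    if not suggestions:
        return None
    return max(suggestions, key=lambda s: SEVERITY[s])


results = [
    {"action": "antispam", "label": "normal", "rate": 1, "suggestion": "pass"},
]
print(strictest_suggestion(results))  # pass
```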

Table: antispam-extraData

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| hint | JSON array | No | Hit keywords |
| label | String | No | Type of the hit keyword |
| rate | Floating-point number | No | Not meaningful; always 1.0 |
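
The hit keywords can be useful for logging or manual review. The sketch below groups them by keyword type. It assumes that `hint` is an array of strings, which the table does not state explicitly, and the function name `hit_keywords` and the sample data are hypothetical.

```python
def hit_keywords(result):
    """Group the hit keywords of one result element by keyword type (label)."""
    hits = {}
    for entry in result.get("extraData") or []:
        label = entry.get("label", "unknown")  # label is optional in the table
        hits.setdefault(label, []).extend(entry.get("hint") or [])
    return hits


# Made-up result element for illustration; real keywords come from the service.
result = {
    "action": "antispam",
    "label": "ad",
    "suggestion": "block",
    "extraData": [
        {"hint": ["word1", "word2"], "label": "ad", "rate": 1.0},
        {"hint": ["word3"], "label": "abuse", "rate": 1.0},
    ],
}
print(hit_keywords(result))
```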

3.3 Error Code Description

The error codes returned in the JSON object are described below:

| Error Code | Error Description |
| --- | --- |
| 200 | Succeeded |
| 401 | Request parsing error |
| 402 | Download failed |
| 501 | Internal processing error |
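
The error codes above can be mapped to readable messages on the client. The sketch below also includes a retry hint. Treating only 501 as retryable is purely an assumption of this example, since the documentation does not say which errors are transient. The names `describe_error` and `is_retryable` are hypothetical.

```python
ERROR_DESCRIPTIONS = {
    200: "Succeeded",
    401: "Request parsing error",
    402: "Download failed",
    501: "Internal processing error",
}


def describe_error(code):
    """Return the documented description for `code`, or a fallback for unknown codes."""
    return ERROR_DESCRIPTIONS.get(code, "Unknown error code %s" % code)


def is_retryable(code):
    """Assumption of this sketch: only internal errors (501) are worth retrying."""
    return code == 501


for code in (200, 401, 402, 501):
    print(code, describe_error(code), is_retryable(code))
```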

4. Call Sample

The following sample code calls the API with Python:

```python
# -*- coding: utf-8 -*-
# Python 3.6+

import base64
import uuid

import requests

host = "https://ai.jocloud.com"

appid = 123456789  # Your service ID
restful_id = '********************'  # Your certificate ID
restful_secret = '********************'  # Your certificate key
traceid = str(uuid.uuid4())  # Use a different traceId for each request
dataid = str(uuid.uuid4())

# Request URL
url = host + '/app/%s/v1/text/sync?traceId=%s' % (appid, traceid)

# Request headers
headers = {
    "content-type": "application/json"
}

# Authentication token built from the certificate ID and key
auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()

text = "Welcome to the text audit interface!!"

# Request body
values = {
    'actions': ['antispam'],
    'data': [
        {
            'dataId': dataid,
            'dataType': 'BASE64',
            'content': base64.b64encode(text.encode('utf-8')).decode(),
            'extra': {'lang': 'english'},  # Language of the text above
            'context': {'uid': 12345, 'sid': 3467}
        }
    ]
}

# Send the request, then print the HTTP status code and the response body
res = requests.post(url, json=values, headers=headers)
print('code=%s, data=%s\n' % (res.status_code, res.text))
```

Response content

```json
{
  "code": 200,
  "message": "OK",
  "traceId": "6b9e3020-8e6d-48aa-92ff-34ffd14c4ae1",
  "requestId": "f42b1ef4-4e39-41c0-9b39-497e27b8b8cf",
  "timestamp": 1584071473,
  "data": [
    {
      "code": 200,
      "message": "OK",
      "dataId": "714610f2-665d-459c-8880-98553ca2b4ca",
      "taskId": "c5f6fa3a-6af5-4d7b-bbe9-3af6a11c0ca8",
      "context": {
        "uid": 12345,
        "sid": 3467
      },
      "results": [
        {
          "action": "antispam",
          "code": 200,
          "extraData": [],
          "label": "normal",
          "message": "OK",
          "rate": 1,
          "suggestion": "pass"
        }
      ]
    }
  ]
}
```

5. Version Description

| Version | Time | Description |
| --- | --- | --- |
| V1.0.0 | 2020-03-13 | First version |