Text Moderation

Text moderation recognizes malicious information in text and returns a moderation result together with a handling suggestion.

1. Introduction

The caller submits text for moderation and specifies the detection type. The server returns the result synchronously.
Currently, the only available detection type is sensitive word recognition.

Detection types and their labels are listed below:

Detection Type	Description	Action	Label
Sensitive word recognition	Recognize sensitive and illegal content in text	antispam	normal: normal; terrorism: terrorism; porn: porn; illegal: illegal; politics: politically sensitive content; abuse: abuse; ad: advertisement cheating; feudalism: feudalism; religion: religiously sensitive content; affairs: affairs; contraband: contraband; minors: minors; banned-website: banned websites

2. Restrictions

Category	Description
Text content	Base64-encoded string of UTF-8 text
Text size	Each text in a request may contain up to 1,000 characters
Character format	Text submitted for moderation must be UTF-8 encoded
Language support	Chinese, English, Arabic and Bahasa Indonesia; for other languages, contact customer support.
Timeout limit	A client-side call timeout of 5 s is recommended.
Concurrency restriction	Up to 20 texts processed per second (20 QPS). For a higher QPS, contact customer support.
Area restriction	Available only in the Chinese mainland. For support in other countries and regions, contact business support.
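
The size and encoding restrictions above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service: the helper name encode_text is hypothetical, and it assumes that "characters" means Unicode code points as counted by Python's len().

```python
import base64

MAX_TEXT_CHARS = 1000  # per-text limit taken from the restrictions table


def encode_text(text):
    """Return the Base64 encoding of the UTF-8 bytes of `text`.

    Raises ValueError if the text exceeds the documented length limit.
    Assumption: the limit counts Unicode code points, as len() does.
    """
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            "text has %d characters; the limit is %d" % (len(text), MAX_TEXT_CHARS)
        )
    return base64.b64encode(text.encode("utf-8")).decode("ascii")
```

Rejecting oversized text locally avoids spending one of the 20 requests per second on a call that the restrictions table already rules out.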

3. API

3.1 Initiate a Request

Send a moderation request using HTTP POST

Item	Description
Request method	POST
Request protocol	HTTPS
Request domain name	ai.jocloud.com
Request path	app/{appid}/v1/text/sync?traceId=uuid-xxxx-xxxx-xxxx-xxxx
Request parameters	traceId is a UUID string used to locate problems during troubleshooting. Using a different value for each request is recommended.
Request header	Content-Type: application/json;charset=UTF-8; token: authentication token (for how to generate it, see Identity Authentication)
Request body	JSON string, defined in the table "body data structure" below
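
The request line described in this table can be assembled as in the sketch below. The function name build_url is hypothetical; the host and path are the ones listed above, and a new traceId is generated whenever the caller does not supply one.

```python
import uuid
from urllib.parse import urlencode

HOST = "https://ai.jocloud.com"  # request domain name over HTTPS


def build_url(appid, trace_id=None):
    """Build the synchronous text moderation URL for one request."""
    if trace_id is None:
        # A fresh UUID per request keeps individual calls traceable.
        trace_id = str(uuid.uuid4())
    query = urlencode({"traceId": trace_id})
    return "%s/app/%s/v1/text/sync?%s" % (HOST, appid, query)
```

Logging the traceId next to each call makes it possible to quote the value when a problem has to be investigated.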

Table: body data structure

Name	Type	Required	Description
actions	String array	Yes	Detection types. Options: - antispam
data[]	JSON array	Yes	List of objects to detect. Each element of the JSON array describes one text (see the table "request data" below). A single request can carry up to 10 texts, and each text contains no more than 1,000 characters.

Table: request data

Name	Type	Required	Description
dataId	String	Yes	Unique object ID, for example: uuid-xxxx-xxxx-xxxx-xxxx
dataType	String	Yes	Data type. Options: - URL: URL starting with HTTP/HTTPS - BASE64: Base64-encoded string of UTF-8 text
content	String	Yes	Text content to be detected. If dataType is URL, enter the URL of the text. If dataType is BASE64, enter the Base64-encoded string of the text.
extra	JSON	No	Additional configuration, defined in the table "extra" below
context	JSON	No	Custom context data, returned unchanged with the result.

Table: extra

Name	Type	Description
lang	String	Language for text detection, Chinese by default. Options: - chinese: Chinese - english: English - arabic: Arabic - bahasa: Bahasa Indonesia
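
Because one request carries at most 10 texts, a longer list has to be split across several request bodies. The following is a minimal sketch of that batching; the function name build_bodies is hypothetical, and the field names follow the tables above.

```python
import base64
import uuid

MAX_TEXTS_PER_REQUEST = 10  # documented per-request limit


def build_bodies(texts, lang="chinese"):
    """Yield request bodies, each carrying at most 10 Base64-encoded texts."""
    for start in range(0, len(texts), MAX_TEXTS_PER_REQUEST):
        batch = texts[start:start + MAX_TEXTS_PER_REQUEST]
        yield {
            "actions": ["antispam"],
            "data": [
                {
                    # A unique dataId lets each result be matched to its text.
                    "dataId": str(uuid.uuid4()),
                    "dataType": "BASE64",
                    "content": base64.b64encode(t.encode("utf-8")).decode("ascii"),
                    "extra": {"lang": lang},
                }
                for t in batch
            ],
        }
```

For example, 23 texts produce three bodies holding 10, 10 and 3 items, and each body is sent as its own POST request.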

3.2 Response

The response content is a JSON object, defined as follows:

Name	Type	Required	Description
code	Integer	Yes	Error code. See the error code description
message	String	Yes	Error message description
traceId	String	Yes	The traceId request parameter, passed through unchanged
requestId	String	Yes	Unique ID generated by the system for this detection request
timestamp	Integer	Yes	Current Unix timestamp, in seconds
data[]	JSON array	No	List of detection results (for the structure, see the table "returned data" below). Each item in the array is the processing result of one text. This field may be empty in case of errors.

Table: returned data

Name	Type	Required	Description
code	Integer	Yes	Error code. See the error code description
message	String	Yes	Error description
dataId	String	Yes	Map to dataId in the request
taskId	String	Yes	Unique task identifier generated for this detection object, shared by all of its detection types
context	JSON	No	Map to context in the request
results[]	Array	No	Result data. When the call succeeds (code == 200), results contains one or more elements, each with the structure shown in the table "Result" below.

Table: Result

Name	Type	Required	Description
action	String	Yes	Detection type, consistent with the detection type (actions) in the call request
label	String	Yes	Detection result label; its value depends on action. For the possible values, see the table of detection types and labels in the Introduction.
rate	Floating-point number	Yes	Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the more likely the text falls into this category.
suggestion	String	Yes	Suggested operation. Options: - pass: normal, no operation required - review: suspected, further manual review required - block: illegal, penalty suggested
extraData[]	JSON array	No	Extension data, including the hit keywords and extension information (e.g. keyword types). For details, see the table "antispam-extraData".

Table: antispam-extraData

Name	Type	Required	Description
hint	JSON array	No	Hit keywords
label	String	No	Type of the hit keywords
rate	Floating-point number	No	Carries no meaning; always 1.0
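
The nested response structure described in the tables above can be reduced to one verdict per text. The following is a minimal sketch; the function name summarize is hypothetical, and it assumes that code 200 indicates success at both the request level and the item level.

```python
def summarize(response):
    """Map each dataId to (label, suggestion, rate) for its antispam result.

    Items that failed or carry no antispam result map to None.
    Assumption: code 200 means success at both levels of the response.
    """
    if response.get("code") != 200:
        raise RuntimeError(
            "request failed: %s %s" % (response.get("code"), response.get("message"))
        )
    summary = {}
    for item in response.get("data") or []:
        summary[item["dataId"]] = None
        if item.get("code") != 200:
            continue
        for result in item.get("results") or []:
            if result["action"] == "antispam":
                summary[item["dataId"]] = (
                    result["label"],
                    result["suggestion"],
                    result["rate"],
                )
    return summary
```

A caller can then branch on the suggestion value, for example publishing on pass, queueing for manual review on review, and rejecting on block.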

4. Sample Code

The following sample code calls the API with Python:

# -*- coding: utf-8 -*-
# ! python3.6

import requests
import uuid
import base64

host = "https://ai.jocloud.com"

appid = 123456789  # Your Service id
restful_id = '********************'  # Your certificate ID
restful_secret = '********************'  # Your certificate key
traceid = str(uuid.uuid4())
dataid = str(uuid.uuid4())

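# Request URL, carrying the traceId generated above
url = host + '/app/%s/v1/text/sync?traceId=%s' % (appid, traceid)

# Request headers, matching the "Request header" row in section 3.1
headers = {
    "content-type": "application/json;charset=UTF-8"
}

# Authentication token built from the certificate ID and key
auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()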

text = "Welcome to the text audit interface！！"

# Request body: one Base64-encoded text checked with antispam
values = {
    'actions': ['antispam'],
    'data': [
        {
            'dataId': dataid,
            'dataType': 'BASE64',
            'content': base64.b64encode(text.encode('utf-8')).decode(),
            'extra': {'lang': 'chinese'},
            'context': {'uid': 12345, 'sid': 3467}
        }
    ]
}

# Send the request with the recommended 5 s client-side timeout
res = requests.post(url, json=values, headers=headers, timeout=5)
print('code=%s, data=%s\n' % (res.status_code, res.text))

Sample response content:

{
  "code": 200,
  "message": "OK",
  "traceId": "6b9e3020-8e6d-48aa-92ff-34ffd14c4ae1",
  "requestId": "f42b1ef4-4e39-41c0-9b39-497e27b8b8cf",
  "timestamp": 1584071473,
  "data": [
    {
      "code": 200,
      "message": "OK",
      "dataId": "714610f2-665d-459c-8880-98553ca2b4ca",
      "taskId": "c5f6fa3a-6af5-4d7b-bbe9-3af6a11c0ca8",
      "context": {
        "uid": 12345,
        "sid": 3467
      },
      "results": [
        {
          "action": "antispam",
          "code": 200,
          "extraData": [],
          "label": "normal",
          "message": "OK",
          "rate": 1,
          "suggestion": "pass"
        }
      ]
    }
  ]
}

5. Update History

Version	Time	Description
V1.0.1	2020-07-24	Add Bahasa Indonesia language support
V1.0.0	2020-03-13	First version