Text moderation recognizes malicious information in text and returns a moderation result together with handling suggestions.
The caller submits text for moderation and specifies the detection type. The server returns the result synchronously.
The only detection type currently available is sensitive word recognition.
Detection types and their labels are listed below:
Detection Type | Description | Action | Label |
---|---|---|---|
Sensitive word recognition | Recognizes sensitive and illegal content in text | antispam | normal: normal; terrorism: terrorism; porn: porn; illegal: illegal; politics: politically sensitive content; abuse: abuse; ad: advertisement cheating; feudalism: feudalism; religion: religiously sensitive content; affairs: affairs; contraband: contraband; minors: minors; banned-website: banned websites |
Category | Description |
---|---|
Text content | Base64-encoded string of UTF-8 text |
Text size | Up to 1,000 characters per text |
Character format | Text submitted for moderation must be UTF-8 encoded |
Language support | Chinese, English, Arabic and Bahasa Indonesia; for more language support, contact customer support. |
Timeout limit | A client-side call timeout of 5 s is recommended. |
Concurrency restriction | Processes up to 20 texts per second (20 QPS). For higher QPS, contact customer support. |
Area restriction | Available only in the Chinese mainland. For support in other countries and regions, contact business support. |
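The timeout and concurrency limits above can be respected on the client side. The following is a minimal sketch, not part of the service's SDK: the `Throttle` class and `REQUEST_TIMEOUT_SECONDS` constant are hypothetical names, and the sketch assumes the 20 QPS limit is counted per request from a single process, which the table does not spell out.

```python
import time

# Recommended client-side call timeout from the limits table (seconds).
REQUEST_TIMEOUT_SECONDS = 5


class Throttle:
    """Space calls at least 1/max_qps seconds apart (single process only)."""

    def __init__(self, max_qps=20, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_qps
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.next_allowed = 0.0

    def wait(self):
        """Block until the next request may be sent, then reserve that slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Call `throttle.wait()` immediately before each request and pass `REQUEST_TIMEOUT_SECONDS` as the HTTP client's timeout. A deployment with several processes would need a shared limiter instead.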
Send a moderation request using HTTP POST
Item | Description |
---|---|
Request method | POST |
Request protocol | HTTPS |
Request domain name | ai.jocloud.com |
Request path | app/{appid}/v1/text/sync?traceId=uuid-xxxx-xxxx-xxxx-xxxx |
Request parameters | traceId is a UUID string used to locate problems during troubleshooting. Use a different value for each request. |
Request header | Content-Type: application/json;charset=UTF-8; token: authentication token (see Identity Authentication for how to generate it) |
Request body | JSON string, defined as follows |
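As an illustration of how the items in this table fit together, the sketch below assembles the request URL and headers. `build_request_target` is a hypothetical helper name, and the `token` argument is assumed to have been generated already as described in Identity Authentication.

```python
import uuid

HOST = "https://ai.jocloud.com"


def build_request_target(appid, token, trace_id=None):
    """Return (url, headers, trace_id) for one synchronous moderation call."""
    # A fresh traceId per request makes troubleshooting easier.
    trace_id = trace_id or str(uuid.uuid4())
    url = "%s/app/%s/v1/text/sync?traceId=%s" % (HOST, appid, trace_id)
    headers = {
        "Content-Type": "application/json;charset=UTF-8",
        "token": token,  # assumed to be generated beforehand
    }
    return url, headers, trace_id
```

Returning the `trace_id` lets the caller log it next to the response, so that it can be quoted when a problem needs to be traced.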
Table: body data structure
Name | Type | Required | Description |
---|---|---|---|
actions | String array | Yes | Detection types. Options: antispam |
data[] | JSON array | Yes | List of objects to detect. Each element of the JSON array describes one text (see the table "request data" below). A single request can process up to 10 texts, each containing no more than 1,000 characters. |
Table: request data
Name | Type | Required | Description |
---|---|---|---|
dataId | String | Yes | Unique object ID, for example: uuid-xxxx-xxxx-xxxx-xxxx |
dataType | String | Yes | Data type. URL: a URL starting with HTTP/HTTPS; BASE64: Base64-encoded string of UTF-8 text |
content | String | Yes | Text content to be detected. If dataType is URL, enter the URL of the text; if dataType is BASE64, enter the Base64-encoded string of the text. |
extra | JSON | No | Additional configuration, defined in the table "extra" below |
context | JSON | No | Custom context data, passed through and returned unchanged with the result. |
Table: extra
Name | Type | Description |
---|---|---|
lang | String | Language for text detection; defaults to Chinese. Options: chinese (Chinese), english (English), arabic (Arabic), bahasa (Bahasa Indonesia) |
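The body structure described in the tables above can be assembled programmatically. The sketch below is one possible approach: `build_bodies` is a hypothetical helper, and it assumes that the 1,000-character limit counts Unicode code points (Python's `len`), which the documentation does not specify. It splits the input into requests of at most 10 texts each.

```python
import base64
import uuid

MAX_TEXTS_PER_REQUEST = 10
MAX_CHARS_PER_TEXT = 1000
SUPPORTED_LANGS = {"chinese", "english", "arabic", "bahasa"}


def build_bodies(texts, lang="chinese"):
    """Return a list of request bodies, each holding at most 10 texts."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError("unsupported lang: %r" % lang)
    items = []
    for text in texts:
        # Assumption: the limit is measured in code points.
        if len(text) > MAX_CHARS_PER_TEXT:
            raise ValueError("text exceeds %d characters" % MAX_CHARS_PER_TEXT)
        items.append({
            "dataId": str(uuid.uuid4()),
            "dataType": "BASE64",
            "content": base64.b64encode(text.encode("utf-8")).decode("ascii"),
            "extra": {"lang": lang},
        })
    return [
        {"actions": ["antispam"], "data": items[i:i + MAX_TEXTS_PER_REQUEST]}
        for i in range(0, len(items), MAX_TEXTS_PER_REQUEST)
    ]
```

Each returned body can be sent as the JSON payload of a separate POST request. Rejecting over-length text locally avoids spending a request on input the service would refuse.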
The response content is a JSON object, as defined below:
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code. See the error code description |
message | String | Yes | Error message description |
traceId | String | Yes | The traceId request parameter, passed through unchanged |
requestId | String | Yes | Unique task ID generated by the system for this detection request |
timestamp | Integer | Yes | Current Unix timestamp, in seconds |
data[] | JSON array | No | Detection results (for the structure, see the table "returned data" below). Each item in the array is the processing result of one text. This field may be empty if an error occurs. |
Table: returned data
Name | Type | Required | Description |
---|---|---|---|
code | Integer | Yes | Error code. See the error code description |
message | String | Yes | Error description |
dataId | String | Yes | Corresponds to dataId in the request |
taskId | String | Yes | Unique task identifier generated for this detection object, shared across its detection types |
context | JSON | No | Corresponds to context in the request |
results[] | Array | No | Result data. When the call succeeds (code==200), it contains one or more elements, each structured as shown in the table "Result" below. |
Table: Result
Name | Type | Required | Description |
---|---|---|---|
action | String | Yes | Detection type, matching one of the detection types (actions) in the request |
label | String | Yes | Detection result label. Its value depends on action; for the possible values, see the table of detection types and labels above |
rate | Floating-point number | Yes | Probability of the detection result label, in the range [0.00, 1.00]. The larger the value, the more likely the text belongs to this category. |
suggestion | String | Yes | Suggested operation. Options: pass (normal, no operation required); review (suspected, further manual review required); block (illegal, punishment suggested) |
extraData[] | JSON array | No | Extension data, including the hit keywords and extension information (e.g. keyword types). For details, see the table "antispam-extraData". |
Table: antispam-extraData
Name | Type | Required | Description |
---|---|---|---|
hint | JSON array | No | Hit keywords |
label | String | No | Type of hit keyword |
rate | Floating-point number | No | Not meaningful; always 1.0 |
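To show how the nested response tables relate to one another, the sketch below flattens a parsed response into one row per detection result. `summarize` is a hypothetical helper. It assumes that a code of 200 indicates success at both the top level and the per-text level, and that each `hint` is a list of keyword strings.

```python
def summarize(response):
    """Flatten a parsed response dict into one row per detection result."""
    if response.get("code") != 200:
        raise RuntimeError("request failed: %s %s"
                           % (response.get("code"), response.get("message")))
    rows = []
    for item in response.get("data") or []:
        if item.get("code") != 200:
            # Per-text failure: keep the error so the caller can retry or log.
            rows.append({"dataId": item.get("dataId"),
                         "error": item.get("message")})
            continue
        for result in item.get("results") or []:
            hints = []
            for extra in result.get("extraData") or []:
                hints.extend(extra.get("hint") or [])
            rows.append({
                "dataId": item.get("dataId"),
                "action": result.get("action"),
                "label": result.get("label"),
                "rate": result.get("rate"),
                "suggestion": result.get("suggestion"),
                "hints": hints,
            })
    return rows
```

A caller might then route each row on its `suggestion` value: let `pass` through, queue `review` for manual inspection, and act on `block`.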
The following sample code calls the API with Python:
# -*- coding: utf-8 -*-
# ! python3.6
import requests
import uuid
import base64
host = "https://ai.jocloud.com"
appid = 123456789 # Your Service id
restful_id = '********************' # Your certificate ID
restful_secret = '********************' # Your certificate key
traceid = str(uuid.uuid4())
dataid = str(uuid.uuid4())
# url
url = host + '/app/%s/v1/text/sync?traceId=%s' % (appid, traceid)
# headers
headers = {
    "content-type": "application/json"
}
auth = base64.b64encode(("%s:%s" % (restful_id, restful_secret)).encode('utf-8'))
headers['token'] = 'Base ' + auth.decode()
text = "Welcome to the text audit interface!!"
# content
values = {
    'actions': ['antispam'],
    'data': [
        {
            'dataId': dataid,
            'dataType': 'BASE64',
            'content': base64.b64encode(text.encode('utf-8')).decode(),
            'extra': {'lang': 'chinese'},
            'context': {'uid': 12345, 'sid': 3467}
        }
    ]
}
# request (5 s timeout, per the recommended timeout limit)
res = requests.post(url, json=values, headers=headers, timeout=5)
print('code=%s, data=%s\n' % (res.status_code, res.text))
Response content
{
    "code": 200,
    "message": "OK",
    "traceId": "6b9e3020-8e6d-48aa-92ff-34ffd14c4ae1",
    "requestId": "f42b1ef4-4e39-41c0-9b39-497e27b8b8cf",
    "timestamp": 1584071473,
    "data": [
        {
            "code": 200,
            "message": "OK",
            "dataId": "714610f2-665d-459c-8880-98553ca2b4ca",
            "taskId": "c5f6fa3a-6af5-4d7b-bbe9-3af6a11c0ca8",
            "context": {
                "uid": 12345,
                "sid": 3467
            },
            "results": [
                {
                    "action": "antispam",
                    "code": 200,
                    "extraData": [],
                    "label": "normal",
                    "message": "OK",
                    "rate": 1,
                    "suggestion": "pass"
                }
            ]
        }
    ]
}
Version | Time | Description |
---|---|---|
V1.0.1 | 2020-07-24 | Add Bahasa Indonesia language support |
V1.0.0 | 2020-03-13 | First version |