Version: V3.5

API Reference

Domain: https://cloud.laiye.com/idp
Request protocol: HTTPS
Data format: JSON

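Authentication

4.1 Application signature authentication

HTTP Header:

header key	Description
Api-Auth-pubkey	The pubkey of the application created by the user
Api-Auth-timestamp	Current timestamp (in seconds)
Api-Auth-nonce	Random string
Api-Auth-sign	Signature. Generation rule: the SHA-1 hex digest of the concatenation Api-Auth-nonce + Api-Auth-timestamp + secret_key

Sample code (Python)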

# encoding: utf-8
import hashlib
import random
import string
import time


def GenerateHeader(api_auth_pubkey, app_auth_secretkey):
    """Build the Api-Auth-* headers for application signature authentication."""
    # Check types before values so a non-string argument gets the right error.
    if not isinstance(api_auth_pubkey, str):
        raise TypeError("Type Exception, the type of api_auth_pubkey must be string")
    if not isinstance(app_auth_secretkey, str):
        raise TypeError("Type Exception, the type of app_auth_secretkey must be string")
    if api_auth_pubkey == "":
        raise ValueError("Value Exception, the api_auth_pubkey must not be empty")
    if app_auth_secretkey == "":
        raise ValueError("Value Exception, the app_auth_secretkey must not be empty")

    api_auth_timestamp = str(int(time.time()))  # current time in seconds
    # Nonce: 10 distinct random letters and digits.
    api_auth_nonce = "".join(random.sample(string.ascii_letters + string.digits, 10))

    # sign = sha1(nonce + timestamp + secret_key), hex-encoded
    token_key = api_auth_nonce + api_auth_timestamp + app_auth_secretkey
    sign = hashlib.sha1(token_key.encode("utf-8")).hexdigest()

    return {
        "Api-Auth-nonce": api_auth_nonce,
        "Api-Auth-pubkey": api_auth_pubkey,
        "Api-Auth-timestamp": api_auth_timestamp,
        "Api-Auth-sign": sign,
    }

4.2 Token authentication

HTTP Header:

header key	Description
Api-Auth-access-token	The access-token obtained after the account logs in via OAuth

4.3 SDK resources

SDK download link

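Common error codes

code	Description
0	OK
3	Invalid argument (the returned Message varies with the type of error)
8	Resource exhausted, for example the request body is too large (the returned Message varies with the type of error)
10000	Internal service error
10001	Header parsing error
10002	Signature verification failed
10003	Invalid parameter: the application does not exist
10004	Classifier algorithm type mismatch
10005	The classifier model must be updated first
10006	An image to recognize must be selected
10007	Wrong file type
10008	Invalid format; only png, jpeg, jpg, bmp, tiff and pdf are supported
10009	Invalid file dimensions; width and height must each be between 15 and 4096 pixels
10010	Processing timed out
10011	Insufficient account quota
10012	Token has expired
10013	Invalid AI capability module ID
10014	The AI capability is not enabled
10015	Call rate limit exceeded
10016	Application type mismatch; choose the correct API
10017	Encrypted PDFs are not supported
10018	Request data too large; keep it within 10 MB
10019	No active template exists, so template recognition cannot be performed
10020	No active template was matched
10021	No published version exists, so extraction cannot be performed
10022	Invalid request parameters
10023	Only UTF-8 encoding is supported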
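
Every response in this document carries a `code` and a `message`, so a caller will usually want one place that turns a non-zero code into an error. The following is a minimal sketch of such a helper. The names `IdpApiError`, `check_response`, `is_retryable` and the choice of which codes count as retryable are assumptions of this example, not part of the API.

```python
class IdpApiError(Exception):
    """Raised when a response body carries a non-zero `code`."""

    def __init__(self, code, message):
        super().__init__("IDP API error {0}: {1}".format(code, message))
        self.code = code
        self.message = message


# Codes that plausibly succeed on a later attempt (internal error, timeout,
# rate limit). This selection is an assumption of the sketch.
RETRYABLE_CODES = {10000, 10010, 10015}


def check_response(body):
    """Return body['data'] when code is 0, otherwise raise IdpApiError."""
    code = body.get("code")
    if code != 0:
        raise IdpApiError(code, body.get("message", ""))
    return body.get("data")


def is_retryable(error):
    """Tell whether an IdpApiError is worth retrying under the assumption above."""
    return error.code in RETRYABLE_CODES
```

A caller would pass the parsed JSON body of any response to `check_response` and catch `IdpApiError` around it.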

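Rate limiting

For security and responsiveness, calls to the open platform APIs are rate limited. Every response carries the following three parameters in its headers (Response Headers):

X-Ratelimit-Remaining: requests remaining in the current time window;
X-Ratelimit-Reset: time of the next reset (UTC);
X-Ratelimit-Limit: maximum number of requests allowed in the current time window.

Rate limit rules:

AI capability	URI	Limit
General text recognition	/v1/mage/ocr/general	Enterprise: as agreed with sales; Free: 6 requests per minute
General table recognition	/v1/mage/ocr/table	Enterprise: as agreed with sales; Free: 6 requests per minute
General card and certificate recognition	/v1/mage/ocr/license	Enterprise: as agreed with sales; Free: 6 requests per minute
General multi-bill recognition	/v1/mage/ocr/bills	Enterprise: as agreed with sales; Free: 6 requests per minute
Template recognition	/v1/document/ocr/template	Enterprise: as agreed with sales; Free: 6 requests per minute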
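
A client can read the three rate-limit headers after each call and pause once the window is used up. The sketch below shows one way to do that; `parse_rate_limit` and `window_exhausted` are hypothetical helper names, and the reset value is kept as a raw string because this document does not specify its exact format.

```python
def parse_rate_limit(headers):
    """Extract the three rate-limit headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}

    def to_int(name):
        value = lowered.get(name)
        return int(value) if value is not None else None

    return {
        "remaining": to_int("x-ratelimit-remaining"),
        "limit": to_int("x-ratelimit-limit"),
        # Left unparsed: the format of the reset time is not documented here.
        "reset": lowered.get("x-ratelimit-reset"),
    }


def window_exhausted(info):
    """True when the current window has no requests left."""
    return info["remaining"] is not None and info["remaining"] <= 0
```

With the `requests` library the mapping to pass in would be `response.headers`.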

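General text recognition

Description

Detects and recognizes the text in an image. It is mainly used to digitize documents of all kinds, such as contracts. Built on deep learning, it can reach 99% accuracy on printed text.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/general

Request parameters

Parameter	Required	Type	Description
with_struct_info		boolean	Whether to return structured information
with_char_info		boolean	Whether to return per-character information
img_base64	required	Array of strings	Base64 encoding of the image binary. Pages of a multi-page document are sent in order; only a single image is supported at present

Request example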

{
  "with_struct_info": true,
  "with_char_info": true,
  "img_base64": [
    "string"
  ]
}
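
As an illustration of how the request body above might be produced from raw image bytes, here is a minimal sketch. It assumes the full URL is the domain given at the top of this document joined with the Request URI; `BASE_URL` and `build_general_ocr_request` are names invented for this example.

```python
import base64
import json

# Assumption: the domain from the top of this document is used as the URL prefix.
BASE_URL = "https://cloud.laiye.com/idp"


def build_general_ocr_request(image_bytes, with_struct_info=False, with_char_info=False):
    """Return (url, json_body) for POST /v1/mage/ocr/general."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = {
        "with_struct_info": with_struct_info,
        "with_char_info": with_char_info,
        # An array, although only a single image is supported at present.
        "img_base64": [encoded],
    }
    return BASE_URL + "/v1/mage/ocr/general", json.dumps(body)
```

The returned pair could then be sent with an HTTP client of your choice, for example `requests.post(url, data=body, headers=GenerateHeader(pubkey, secret_key))`, where `pubkey` and `secret_key` are your application's credentials.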

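Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	docUnderstandingOcrGeneralResultNew
+struct_content	object	ocrStructContent
++paragraph	Array of objects	Paragraph-level recognition results
+++content	string	Recognized text
+++paragraph_id	int	Paragraph number
++page	Array of objects	Page-level recognition results
+++content	string	Recognized text
+++page_id	int	Page number
++row	Array of objects	Row-level recognition results
+++content	string	Recognized text
+++row_id	int	Row number
+rotated_image_width	int	Width of the rotated image
+img_id	string	Unique ID generated for each image
+msg_id	string	Unique request ID
+image_angle	int	Rotation angle
+items	Array of objects	OCR recognition results
++probabilities	Array of objects	Probability of each character in the result
+++char	string
+++probability	float
++positions	Array of objects	Text block coordinates (the four corners, clockwise from the top-left)
+++y	int
+++x	int
++content	string	Text block content
++char_positions	Array of objects	Coordinates of each character; its length should equal the length of content
+++positions	Array of objects	Character coordinates (the four corners, clockwise from the top-left)
++++y	int
++++x	int
+rotated_image_height	int	Height of the rotated image

Response example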

{
  "message": "string",
  "code": 0,
  "data": {
    "struct_content": {
      "paragraph": [
        {
          "content": "string",
          "paragraph_id": 0
        }
      ],
      "page": [
        {
          "content": "string",
          "page_id": 0
        }
      ],
      "row": [
        {
          "content": "string",
          "row_id": 0
        }
      ]
    },
    "rotated_image_width": 0,
    "img_id": "string",
    "msg_id": "string",
    "image_angle": 0,
    "items": [
      {
        "probabilities": [
          {
            "char": "string",
            "probability": 0
          }
        ],
        "positions": [
          {
            "y": 0,
            "x": 0
          }
        ],
        "content": "string",
        "char_positions": [
          {
            "positions": [
              {
                "y": 0,
                "x": 0
              }
            ]
          }
        ]
      }
    ],
    "rotated_image_height": 0
  }
}
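
Most callers only need the recognized text, and perhaps a list of characters the engine was unsure about. The sketch below walks `data.items` for both. The function names and the default threshold of 0.8 are assumptions of this example; the document does not state the range of `probability`, so adjust the threshold to what your responses actually contain.

```python
def extract_text(response_body):
    """Join the content of every text block in data.items, one block per line."""
    items = response_body.get("data", {}).get("items", [])
    return "\n".join(item.get("content", "") for item in items)


def low_confidence_chars(response_body, threshold=0.8):
    """Return (char, probability) pairs whose probability is below threshold."""
    flagged = []
    for item in response_body.get("data", {}).get("items", []):
        for entry in item.get("probabilities", []):
            if entry["probability"] < threshold:
                flagged.append((entry["char"], entry["probability"]))
    return flagged
```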

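General table recognition

Description

Detects and recognizes tables in an image, including rows, columns and the text in each cell. It is mainly used for documents that contain tables, such as contracts and bills. Bordered tables, borderless tables and tables with merged cells are supported.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/table

Request parameters

Parameter	Required	Type	Description
img_base64	required	Array of strings	Base64 encoding of the image binary. Pages of a multi-page document are sent in order; only a single image is supported at present

Request example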

{
  "with_struct_info": true,
  "with_raw_info": true,
  "img_base64": "string"
}

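Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	docUnderstandingOcrGeneralResultNew
+tables	Array of objects	Table information
++column	int	Total number of columns in the table
++cells	Array of objects	Information about the cells in the table
+++end_col	int	Ending column of the cell
+++start_row	int	Starting row of the cell (start_row=0 when the cell is in the first row; if both the starting row and the ending row are 0, the cells of all columns are merged)
+++positions	Array of objects	Cell coordinates (the four corners, clockwise from the top-left)
++++y	int
++++x	int
+++content	string	Text content of the cell
+++end_row	int	Ending row of the cell
+++start_col	int	Starting column of the cell (start_col=0 when the cell is in the first column; if both the starting column and the ending column are 0, the cells of all rows are merged)
++table_id	int	Table number, starting from 0
++row	int	Total number of rows in the table
+rotated_image_width	int	Width of the rotated image
+img_id	string	Unique ID generated for each image
+msg_id	string	Unique request ID
+image_angle	int	Rotation angle
+items	Array of objects	Information about text outside the tables
++probabilities	Array of objects	Probability of each character in the result
+++char	string
+++probability	float
++positions	Array of objects	Text block coordinates (the four corners, clockwise from the top-left)
+++y	int
+++x	int
++content	string	Text block content
++char_positions	Array of objects	Coordinates of each character; its length should equal the length of content
+++positions	Array of objects	Character coordinates (the four corners, clockwise from the top-left)
++++y	int
++++x	int
+rotated_image_height	int	Height of the rotated image

Response example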

{
  "message": "string",
  "code": 0,
  "data": {
    "tables": [
      {
        "column": 0,
        "cells": [
          {
            "end_col": 0,
            "start_row": 0,
            "positions": [
              {
                "y": 0,
                "x": 0
              }
            ],
            "content": "string",
            "end_row": 0,
            "start_col": 0
          }
        ],
        "table_id": 0,
        "row": 0
      }
    ],
    "rotated_image_width": 0,
    "img_id": "string",
    "msg_id": "string",
    "image_angle": 0,
    "items": [
      {
        "probabilities": [
          {
            "char": "string",
            "probability": 0
          }
        ],
        "positions": [
          {
            "y": 0,
            "x": 0
          }
        ],
        "content": "string",
        "char_positions": [
          {
            "positions": [
              {
                "y": 0,
                "x": 0
              }
            ]
          }
        ]
      }
    ],
    "rotated_image_height": 0
  }
}
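
The `cells` array describes each cell by its row and column span, so rebuilding a plain two-dimensional grid takes one pass over it. The sketch below assumes that row and column indices are zero-based, that `end_row` and `end_col` are inclusive, and that every cell fits inside the table's `row` and `column` totals; `table_to_grid` is a name chosen for this example.

```python
def table_to_grid(table):
    """Build a rows x columns list of lists from one entry of data.tables."""
    grid = [["" for _ in range(table["column"])] for _ in range(table["row"])]
    for cell in table["cells"]:
        # A merged cell covers start..end inclusive (assumed); its text is
        # copied into every grid slot that it spans.
        for r in range(cell["start_row"], cell["end_row"] + 1):
            for c in range(cell["start_col"], cell["end_col"] + 1):
                grid[r][c] = cell["content"]
    return grid
```

Copying merged text into every covered slot keeps the grid rectangular, which is convenient when the result is written out as CSV.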

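Custom template recognition

Description

Extracts information from semi-structured data according to fixed rules.

Request

Request method

HTTP method: POST

Request URI: /v1/document/ocr/template

Request parameters

Parameter	Required	Type	Description
with_struct_info		boolean	Whether to return structured information (only general text is supported at present). Default: false
with_raw_info		boolean	Whether to return the raw recognition output of the engine. Default: false
img_base64	required	string	Base64-encoded image

Request example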

{
  "with_struct_info": true,
  "with_raw_info": true,
  "img_base64": "string"
}

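Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Custom template recognition data
+update_time	int	Template update time, Unix timestamp in seconds
+template_hash	string	Template hash
+template_name	string	Template name
+results	Array of objects	Recognized field results
++field_name	string	Field name
++results	Array of strings	List of recognition results for the field
+raw	object	Raw recognition output of the custom template engine
++tables	Array of objects	Table recognition information
+++column	int64	Total number of columns in the table
+++cells	Array of objects	Information about the cells in the table
++++end_col	int	Ending column of the cell
++++start_row	int	Starting row of the cell (start_row=0 when the cell is in the first row; if both the starting row and the ending row are 0, the cells of all columns are merged)
++++positions	Array of objects	Cell coordinates (the four corners, clockwise from the top-left)
+++++y	int
+++++x	int
++++content	string	Text content of the cell
++++end_row	int	Ending row of the cell
++++start_col	int	Starting column of the cell (start_col=0 when the cell is in the first column; if both the starting column and the ending column are 0, the cells of all rows are merged)
+++table_id	int	Table number, starting from 0
+++row	int	Total number of rows in the table
++struct_content	object	ocrStructContent
+++paragraph	Array of objects	Paragraph-level recognition results
++++content	string	Recognized text
++++paragraph_id	int	Paragraph number
+++page	Array of objects	Page-level recognition results
++++content	string	Recognized text
++++page_id	int	Page number
+++row	Array of objects	Row-level recognition results
++items	Array of objects	Text block recognition information
+++probabilities	Array of objects	Probability of each character in the result
++++char	string
++++probability	float
+++positions	Array of objects	Text block coordinates (the four corners, clockwise from the top-left)
++++y	int
++++x	int
+++content	string	Text block content
+++char_positions	Array of objects	Coordinates of each character; its length should equal the length of content
++++positions	Array of objects	Character coordinates (the four corners, clockwise from the top-left)
+++++y	int
+++++x	int
++rotated_image_width	int	Width of the rotated image
++image_angle	int	Rotation angle of the image
++rotated_image_height	int	Height of the rotated image
+msgId	string	msgId

Response example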

{
  "message": "string",
  "code": 0,
  "data": {
    "update_time": "string",
    "template_hash": "string",
    "template_name": "string",
    "results": [
      {
        "field_name": "string",
        "results": [
          "string"
        ]
      }
    ],
    "raw": {
      "tables": [
        {
          "column": 0,
          "cells": [
            {
              "end_col": 0,
              "start_row": 0,
              "positions": [
                {
                  "y": 0,
                  "x": 0
                }
              ],
              "content": "string",
              "end_row": 0,
              "start_col": 0
            }
          ],
          "table_id": 0,
          "row": 0
        }
      ],
      "struct_content": {
        "paragraph": [
          {
            "content": "string",
            "paragraph_id": 0
          }
        ],
        "page": [
          {
            "content": "string",
            "page_id": 0
          }
        ],
        "row": [
          {
            "content": "string",
            "row_id": 0
          }
        ]
      },
      "items": [
        {
          "probabilities": [
            {
              "char": "string",
              "probability": 0
            }
          ],
          "positions": [
            {
              "y": 0,
              "x": 0
            }
          ],
          "content": "string",
          "char_positions": [
            {
              "positions": [
                {
                  "y": 0,
                  "x": 0
                }
              ]
            }
          ]
        }
      ],
      "rotated_image_width": 0,
      "image_angle": 0,
      "rotated_image_height": 0
    },
    "msgId": "string"
  }
}
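
Since `data.results` pairs each field name with a list of recognized strings, a lookup table is often the most convenient shape for downstream code. A minimal sketch, with `template_fields` as a hypothetical helper name:

```python
def template_fields(response_body):
    """Map each field_name in data.results to its list of recognized values."""
    data = response_body.get("data", {})
    return {entry["field_name"]: entry["results"] for entry in data.get("results", [])}
```

If a template defines the same field name twice, the later entry wins in this sketch; merge the lists instead if that matters for your templates.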

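General card and certificate recognition

Description

Detects and recognizes card and certificate information in an image.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/license

Request parameters

Field	Required	Type	Description
img_base64	required	string	Base64 encoding of the image binary

Request example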

{
  "img_base64": "string"
}

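Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	docUnderstandingOcrLicenseResultNew
+msg_id	string	Unique request ID
+result	object	ocrOcrLicenseResponse
++type_description	string	Description of the specific document type
++type_key	string	Key of the specific document type
++rotated_image_width	int	Width of the rotated image
++img_id	string	Unique ID generated for each image
++image_angle	int	Rotation angle
++items	Array of objects	Recognition results for the document
+++positions	Array of objects	Coordinates of the recognized field in the original image, in clockwise order, at least 4 points
++++y	int
++++x	int
+++value	string	Recognized field value
+++key	string	Recognized field type
+++description	string	Recognized field description
++type	int	0: default; 1: bank card; 2: business card; 3: Hong Kong identity card; 4: identity card; 5: social security card; 6: driver's license; 7: vehicle license; 8: household register; 9: passport; 10: marriage certificate; 11: divorce certificate; 12: property ownership certificate; 13: real estate certificate; 14: business license; 15: account opening permit; 16: tax registration certificate; 17: organization code certificate; 18: vehicle certificate of conformity; 19: vehicle registration certificate; 20: other; 21: travel permit to and from Hong Kong and Macao; 22: travel permit to and from Taiwan; 23: acceptance bill; 24: Malaysian identity card; 25: New Zealand driver's license; 26: Indonesian resident identity card; 27: Thai identity card; 28: Swedish driver's license; 29: Malaysian driver's license; 30: Philippine identity card; 31: Singapore driver's license; 32: Indonesian driver's license; 33: US driver's license. Default: "0". Enum: "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
++rotated_image_height	int	Height of the rotated image

Response example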

{
  "message": "string",
  "code": 0,
  "data": {
    "msg_id": "string",
    "result": {
      "type_description": "string",
      "type_key": "string",
      "rotated_image_width": 0,
      "img_id": "string",
      "image_angle": 0,
      "items": [
        {
          "positions": [
            {
              "y": 0,
              "x": 0
            }
          ],
          "value": "string",
          "key": "string",
          "description": "string"
        }
      ],
      "type": "0",
      "rotated_image_height": 0
    }
  }
}
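
Each recognized field comes with the corner points of its region, which is handy for highlighting or cropping. The following sketch reduces those points to an axis-aligned bounding box and indexes the fields by `key`. Both function names are assumptions of this example, and it relies on `positions` being non-empty, as the table above says (at least 4 points).

```python
def bounding_box(positions):
    """Return (min_x, min_y, max_x, max_y) for a list of {"x": .., "y": ..} points."""
    xs = [point["x"] for point in positions]
    ys = [point["y"] for point in positions]
    return min(xs), min(ys), max(xs), max(ys)


def license_fields(response_body):
    """Map each recognized field key to its value and bounding box."""
    result = response_body["data"]["result"]
    return {
        item["key"]: {"value": item["value"], "box": bounding_box(item["positions"])}
        for item in result.get("items", [])
    }
```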

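Seal recognition

Description

Determines whether a seal is present and recognizes its color and shape.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/stamp

Request parameters

Field	Required	Type	Description
img_base64	required	string	Base64 encoding of the image binary

Request example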

{
  "img_base64": "string"
}

Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Seal recognition result
+img_id	string	Unique ID generated for each image
+stamps	Array of objects	Seal recognition results
++shape_description	string	Shape description
++confidence	float	Confidence
++color_description	string	Color description
++color	string	Color
++text	string	Text
++shape	string	Shape
++positions	Array of objects	Position
+++y	int
+++x	int
+msg_id	string	Unique request ID

Response example

{
  "message": "string",
  "code": 0,
  "data": {
    "img_id": "string",
    "stamps": [
      {
        "shape_description": "string",
        "confidence": 0,
        "color_description": "string",
        "color": "string",
        "text": "string",
        "shape": "string",
        "positions": [
          {
            "y": 0,
            "x": 0
          }
        ]
      }
    ],
    "msg_id": "string"
  }
}

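General multi-bill recognition

Description

Detects and recognizes bills in an image. A single image may contain several bills.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/bills

Request parameters

Field	Required	Type	Description
img_base64	required	string	Base64 encoding of the image binary

Request example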

{
  "img_base64": "string"
}

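Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	docUnderstandingOcrBillsResultNew
+msg_id	string	Unique request ID
+result	Array of objects	Recognition results
++kind	int	0: default; 1: vehicle use; 2: transportation; 3: medical; 4: education; 5: daily necessities; 6: office; 7: services; 8: digital and electrical appliances; 9: rent and decoration; 10: telecommunications; 11: accommodation; 12: postage; 13: catering; 14: food; 15: clothing; 16: other
++type_description	string	Description of the specific bill type
++type_key	string	Key of the specific bill type
++rotated_image_width	int	Width of the rotated image
++items	Array of objects	Recognition results for the bill
+++positions	Array of objects	Coordinates of the recognized field in the original image, in clockwise order, at least 4 points
++++y	int
++++x	int
+++value	string	Recognized field value
+++key	string	Recognized field type
+++description	string	Recognized field description
++rotated_image_height	int	Height of the rotated image
++goods	Array of objects	Itemized table section of the bill
+++items	Array of objects	ocrOcrCommonItem
++++positions	Array of objects	Coordinates of the recognized field in the original image, in clockwise order, at least 4 points
+++++y	int
+++++x	int
++++value	string	Recognized field value
++++key	string	Recognized field type
++++description	string	Recognized field description
++image_angle	int	Rotation angle
++type	int	0: default; 1: VAT special invoice; 2: unified motor vehicle sales invoice; 3: VAT special invoice for goods transportation; 4: VAT ordinary invoice; 5: VAT electronic ordinary invoice; 6: VAT ordinary invoice (roll); 7: VAT electronic ordinary invoice (toll); 8: unified used-car sales invoice; 9: general machine-printed invoice; 10: general fixed-amount invoice; 11: passenger transport ordinary ticket; 12: highway passenger transport invoice; 13: ferry passenger ticket; 14: taxi invoice; 15: parking fee invoice; 16: road and bridge toll invoice, vehicle toll; 17: medical fee receipt; 18: education fee receipt; 19: itinerary; 20: train ticket; 21: VAT sales list; 22: merchant receipt; 23: other; 24: English-language bill. Default: "0". Enum: "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24"
++class	int	0: default; 1: national tax; 2: local bill type; 3: other bill type

Response example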

{
  "message": "string",
  "code": 0,
  "data": {
    "msg_id": "string",
    "result": {
      "img_id": "string",
      "result": [
        {
          "kind": "0",
          "type_description": "string",
          "type_key": "string",
          "rotated_image_width": 0,
          "items": [
            {
              "positions": [
                {
                  "y": 0,
                  "x": 0
                }
              ],
              "value": "string",
              "key": "string",
              "description": "string"
            }
          ],
          "rotated_image_height": 0,
          "goods": [
            {
              "items": [
                {
                  "positions": [
                    {
                      "y": 0,
                      "x": 0
                    }
                  ],
                  "value": "string",
                  "key": "string",
                  "description": "string"
                }
              ]
            }
          ],
          "image_angle": 0,
          "type": "0",
          "class": "0"
        }
      ]
    }
  }
}

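CAPTCHA recognition

Description

Uses OCR image recognition to detect and recognize the digits and letters in an image CAPTCHA. It works out of the box for specific types of web page CAPTCHAs.

Request

Request method

HTTP method: POST

Request URI: /v1/document/ocr/verification

Request parameters

Field	Required	Type	Description
format		int	Image format
img_base64	required	string	Base64 encoding of the image binary

Request example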

{
  "format": "0",
  "img_base64": "string"
}

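Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	CAPTCHA recognition result
+positions	Array of objects	Coordinates of the top-left vertex of the recognized region
++y	int
++x	int
+msg_id	string	Unique request ID
+result	string	Result

Response example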

{
  "message": "string",
  "code": 0,
  "data": {
    "positions": [
      {
        "y": 0,
        "x": 0
      }
    ],
    "msg_id": "string",
    "result": "string"
  }
}

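Document extraction

Submit a task

Submits a document extraction task: upload one file and receive a task ID.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/nlp/docextract/create

Request parameters

Field	Required	Type	Description
img_base64	required	string	Base64 encoding of the image binary

Request example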

{
  "img_base64": "string"
}

Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
task_id	string	Task ID

Response example

{
  "message": "string",
  "code": 0,
  "task_id": "string"
}

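Get the result

Retrieves the result of a document extraction task by task ID.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/nlp/docextract/query

Request parameters

Field	Required	Type	Description
task_id	required	string	Task ID

Request example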

{
  "task_id": "string"
}

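Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Extraction result
+status	int	Document extraction status. Default: "0". 0: default; 1: waiting; 2: in progress; 3: succeeded; 4: failed
+type_description	string	File type description
+task_id	string	Task ID
+fields	Array of objects	Extraction results
++score	float	Confidence
++values	Array of objects	Extracted values
+++content	string	Text content
+++positions	Array of objects	Groups of text coordinates
++++positions	Array of objects	Coordinates
+++++position	Array of objects	Coordinates of the four points (top-left, top-right, bottom-right, bottom-left)
++++++x	int
++++++y	int
++++page_num	int	Page number
+++is_intext	boolean	Whether the value appears in the original text
++description	string	Field description
++key	string	Field key
+recommend_structure	Array of objects	Suggested structured results
++column	int	Total number of columns in the table
++cells	Array of objects	Information about the cells in the table
+++end_col	int	Ending column of the cell
+++start_row	int	Starting row of the cell (start_row=0 when the cell is in the first row; if both the starting row and the ending row are 0, the cells of all columns are merged)
+++positions	Array of objects	Cell coordinates (the four corners, clockwise from the top-left)
++++y	int
++++x	int
+++content	string	Text content of the cell
+++end_row	int	Ending row of the cell
+++start_col	int	Starting column of the cell (start_col=0 when the cell is in the first column; if both the starting column and the ending column are 0, the cells of all rows are merged)
+++items	Array of objects	Information about each entry in the cell
++++probabilities	Array of objects	Probability of each character in the result
+++++char	string
+++++probability	float
++++positions	Array of objects	Text block coordinates (the four corners, clockwise from the top-left)
+++++y	int
+++++x	int
++++content	string	Text block content
++++char_positions	Array of objects	Information about each character, including coordinates, text and probability
+++++positions	Array of objects	Character coordinates (the four corners, clockwise from the top-left)
++++++y	int
++++++x	int
+++++probability	float	Character probability
+++++text	string	Character text
++++handwrite_info	object	Whether the entry is handwritten (custom model for private deployments; not supported on the public cloud)
+++++handwrite_score	float	Handwriting score
+++++is_handwrite	boolean	Whether it is handwritten
+++++print_score	float	Print score
++++importance_info	object	Whether the entry is key information (custom model for private deployments; not supported on the public cloud)
+++++is_importance	boolean	Whether it is key information
+++++important_score	float	Key information score
+++++unimportant_score	float	Non-key information score
++table_id	int	Table number, starting from 0
++row	int	Total number of rows in the table
+progress	int	Completion percentage
+type_key	string	File type key
+pages	Array of objects	Page information
++content	string	Full recognized text
++image_angle	int	Rotation angle (degrees of clockwise rotation needed to set the original image upright)
++page_num	int	Page number

Response example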

{
  "message": "string",
  "code": 0,
  "data": {
    "status": "0",
    "type_description": "string",
    "task_id": "string",
    "fields": [
      {
        "score": 0,
        "values": [
          {
            "content": "string",
            "positions": [
              {
                "positions": [
                  {
                    "position": [
                      {
                        "y": 0,
                        "x": 0
                      }
                    ]
                  }
                ],
                "page_num": 0
              }
            ],
            "is_intext": true
          }
        ],
        "description": "string",
        "key": "string"
      }
    ],
    "recommend_structure": [
      {
        "column": 0,
        "cells": [
          {
            "end_col": 0,
            "start_row": 0,
            "positions": [
              {
                "y": 0,
                "x": 0
              }
            ],
            "content": "string",
            "end_row": 0,
            "start_col": 0,
            "items": [
              {
                "probabilities": [
                  {
                    "char": "string",
                    "probability": 0
                  }
                ],
                "positions": [
                  {
                    "y": 0,
                    "x": 0
                  }
                ],
                "content": "string",
                "char_positions": [
                  {
                    "positions": [
                      {
                        "y": 0,
                        "x": 0
                      }
                    ],
                    "probability": 0,
                    "text": "string"
                  }
                ],
                "handwrite_info": {
                  "handwrite_score": 0,
                  "is_handwrite": true,
                  "print_score": 0
                },
                "importance_info": {
                  "is_importance": true,
                  "important_score": 0,
                  "unimportant_score": 0
                }
              }
            ]
          }
        ],
        "table_id": 0,
        "row": 0
      }
    ],
    "progress": 0,
    "type_key": "string",
    "pages": [
      {
        "content": "string",
        "image_angle": 0,
        "page_num": 0
      }
    ]
  }
}
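
Because extraction is asynchronous, a client submits the task and then polls the query endpoint until `status` reaches 3 (succeeded) or 4 (failed). The sketch below shows such a loop. The `query` argument is a hypothetical callable that posts `{"task_id": ...}` to `/v1/mage/nlp/docextract/query` and returns the parsed JSON body; the polling interval and attempt limit are arbitrary values chosen for this example.

```python
import time

STATUS_SUCCESS = 3
STATUS_FAILED = 4


def wait_for_extraction(query, task_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll query(task_id) until the task succeeds, fails, or attempts run out."""
    for _ in range(max_attempts):
        body = query(task_id)
        # The response example shows status as the string "0" while the table
        # says int, so normalise before comparing.
        status = int(body["data"]["status"])
        if status == STATUS_SUCCESS:
            return body["data"]
        if status == STATUS_FAILED:
            raise RuntimeError("extraction task {0} failed".format(task_id))
        sleep(interval)
    raise TimeoutError("extraction task {0} did not finish in time".format(task_id))
```

Passing `query` and `sleep` in as parameters keeps the loop independent of any HTTP library and makes it easy to exercise without network access.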

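Layout analysis

Description

Layout analysis service.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/layout

Request parameters

Field	Required	Type	Description
img_base64	required	string	Base64 encoding of the image binary

Request example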

{
  "img_base64": "string"
}

Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Layout analysis result
+items	Array of objects	List of recognized layout regions
++probabilities	Array of objects	Probability of each character in the result
+++char	string
+++probability	float
++positions	Array of objects	Text block coordinates (the four corners, clockwise from the top-left)
+++y	int
+++x	int
++content	string	Text block content
++char_positions	Array of objects	Information about each character, including coordinates, text and probability
+++positions	Array of objects	Character coordinates (the four corners, clockwise from the top-left)
++++y	int
++++x	int
+++probability	float	Character probability
+++text	string	Character text
+score	float	Confidence
+class_key	string	Layout class key
+class_no	int	['page_header', 'page_footer', 'picture', 'paragraph', 'list', 'seal', 'qr_code', 'handwritten_signature', 'logo', 'question', 'other', 'article_title', 'form_title', 'picture_title', 'class_title', 'full_form', 'no_form', 'half_form', 'form_annotation', 'page_annotation', 'picture_annotation']. Default: "0". Enum: "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19". 1: page header; 2: page footer; 3: picture; 4: text paragraph; 5: list; 6: seal; 7: QR code; 8: handwritten signature; 9: logo; 10: other; 11: article title; 12: table title; 13: picture title; 14: paragraph title; 15: bordered table; 16: table with mixed borders; 17: table annotation; 18: page annotation; 19: picture annotation
+msg_id	string	Unique request ID
+image_angle	int	Rotation angle
+rotated_image_height	int	Height of the rotated image
+rotated_image_width	int	Width of the rotated image

Response example

{
  "message": "string",
  "code": 0,
  "data": {
    "items": [
      {
        "area_positions": [
          {
            "y": 0,
            "x": 0
          }
        ],
        "items": [
          {
            "probabilities": [
              {
                "char": "string",
                "probability": 0
              }
            ],
            "positions": [
              {
                "y": 0,
                "x": 0
              }
            ],
            "content": "string",
            "char_positions": [
              {
                "positions": [
                  {
                    "y": 0,
                    "x": 0
                  }
                ],
                "probability": 0,
                "text": "string"
              }
            ],
            "handwrite_info": {
              "handwrite_score": 0,
              "is_handwrite": true,
              "print_score": 0
            },
            "importance_info": {
              "is_importance": true,
              "important_score": 0,
              "unimportant_score": 0
            }
          }
        ],
        "score": 0,
        "class_key": "string",
        "class_no": "0"
      }
    ],
    "msg_id": "string",
    "image_angle": 0,
    "rotated_image_height": 0,
    "rotated_image_width": 0
  }
}

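Information extraction

Description

Text information extraction is an NLP capability developed in-house by Laiye. It extracts key information from text according to defined logic.

Request

Request method

HTTP method: POST

Request URI: /v1/document/extract

Request parameters

Field	Required	Type	Description
doc	required	string	Text to extract information from

Request example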

{
  "doc": "string"
}

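Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Extraction response data
+update_time	int	Version update time, Unix timestamp in seconds
+msg_id	string	Unique ID
+results	Array of objects	Extraction results
++name	string	Name
++unicode_length	int	Length, in Unicode encoding
++template_hash	string	Template hash
++fields	Array of objects	Extracted fields
+++std_value	string	Normalized value; not enabled yet
+++name	string	Field name
+++unicode_length	int	Length, in Unicode encoding
+++unicode_start_pos	int	Start position, in Unicode encoding
+++value	string	Text value
+++length	int	Length within the query, in UTF-8 encoding
+++score	float	Confidence
+++start_pos	int	Start position within the query, in UTF-8 encoding
+++id	int	Field ID
++length	int	Length within the query, in UTF-8 encoding
++unicode_start_pos	int	Start position, in Unicode encoding
++start_pos	int	Start position within the query, in UTF-8 encoding
++id	int	Template ID
++template_str	string	Template text
+debug_info	object	Debug information
++word_segments	Array of strings	Word segmentation result
++match_failed_infos	Array of objects	Reasons templates failed to match
+++failed_msg	string	Details
+++type	int	0: template parsing failed; 1: anchor not found; 2: result conflict; 3: fuzzy match failed; 4: start not matched; 5: end not matched. Default: "0". Enum: "0" "1" "2" "3" "4" "5"
+++id	int	Template ID
++regex_match_failed_infos	Array of objects	Regular expression match failures
+++content	string	Content that failed to match
+++length	int	Length
+++regex_name	string	Regular expression name
+++start_pos	int	Start position
++dict_match_failed_infos	Array of objects	Dictionary match failures
+++length	int	Length
+++dict_name	string	Dictionary name
+++start_pos	int	Start position
+++dict_word	string	Word that failed to match
+version_hash	string	Version hash

Response example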

{
  "message": "string",
  "code": 0,
  "data": {
    "update_time": "string",
    "msg_id": "string",
    "results": [
      {
        "name": "string",
        "unicode_length": 0,
        "template_hash": "string",
        "fields": [
          {
            "std_value": "string",
            "name": "string",
            "unicode_length": 0,
            "unicode_start_pos": 0,
            "value": "string",
            "length": 0,
            "score": 0,
            "start_pos": 0,
            "id": 0
          }
        ],
        "length": 0,
        "unicode_start_pos": 0,
        "start_pos": 0,
        "id": 0,
        "template_str": "string"
      }
    ],
    "debug_info": {
      "word_segments": [
        "string"
      ],
      "match_failed_infos": [
        {
          "failed_msg": "string",
          "type": "0",
          "id": 0
        }
      ],
      "regex_match_failed_infos": [
        {
          "content": "string",
          "length": 0,
          "regex_name": "string",
          "start_pos": 0
        }
      ],
      "dict_match_failed_infos": [
        {
          "length": 0,
          "dict_name": "string",
          "start_pos": 0,
          "dict_word": "string"
        }
      ]
    },
    "version_hash": "string"
  }
}
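
The response reports every span twice: once with UTF-8 offsets (`start_pos`, `length`) and once with Unicode offsets (`unicode_start_pos`, `unicode_length`). The sketch below shows how each pair could be used to cut the span out of the submitted `doc`. It assumes that the UTF-8 values count bytes and the Unicode values count code points, which this document does not state explicitly; both function names are invented for the example.

```python
def slice_by_utf8(doc, start_pos, length):
    """Cut a span out of doc using byte offsets into its UTF-8 encoding."""
    raw = doc.encode("utf-8")
    return raw[start_pos:start_pos + length].decode("utf-8")


def slice_by_unicode(doc, unicode_start_pos, unicode_length):
    """Cut a span out of doc using code point offsets."""
    return doc[unicode_start_pos:unicode_start_pos + unicode_length]
```

For pure ASCII text the two pairs coincide; they diverge as soon as the text contains multi-byte characters such as Chinese, which is where mixing them up produces garbled slices.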

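Address standardization

Address standardization service. Given one address, the system returns its standardized value. It can complete and correct non-standard addresses written in natural language.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/nlp/geoextract

Request parameters

Field	Required	Type	Description
text	required	string	Text

Request example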

{
  "text": "string"
}

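Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Address standardization result
+msg_id	string	Unique request ID
+geo_list	Array of objects	Result list
++province	string	Province (or municipality or autonomous region)
++city	string	City
++poi_type	string	POI type
++district	string	District, county or county-level city
++subdistrict	string	Subdistrict
++length	int
++start_pos	int
++address	string	Detailed address
++poi_name	string	Name

Response example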

{
  "message": "string",
  "code": 0,
  "data": {
    "msg_id": "string",
    "geo_list": [
      {
        "province": "string",
        "city": "string",
        "poi_type": "string",
        "district": "string",
        "subdistrict": "string",
        "length": 0,
        "start_pos": 0,
        "address": "string",
        "poi_name": "string"
      }
    ]
  }
}

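QR code recognition

Recognizes the encoded information of QR codes in an image.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/ocr/barcode

Request parameters

Field	Required	Type	Description
img_base64	required	string	Base64 encoding of the image binary

Request example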

{
  "img_base64": "string"
}

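Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	QR code recognition result
+items	Array of objects	List of QR codes
++positions	Array of objects	Coordinates of the QR code region
+++y	int
+++x	int
++code_type	int	0: default; 1: QR code; 2: barcode. Default: "0". Enum: "0" "1" "2"
++text	string	QR code content
+msg_id	string	Unique request ID
+image_angle	int	Rotation angle
+rotated_image_height	int	Height of the rotated image
+rotated_image_width	int	Width of the rotated image

Response example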

{
  "message": "string",
  "code": 0,
  "data": {
    "items": [
      {
        "positions": [
          {
            "y": 0,
            "x": 0
          }
        ],
        "code_type": "0",
        "text": "string"
      }
    ],
    "msg_id": "string",
    "image_angle": 0,
    "rotated_image_height": 0,
    "rotated_image_width": 0
  }
}
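
An image may contain several codes of different kinds, so it helps to return the decoded text together with a readable type name. A minimal sketch follows; `CODE_TYPE_NAMES` and `decoded_codes` are names chosen for this example, and the labels are simply English renderings of the `code_type` values listed above.

```python
CODE_TYPE_NAMES = {0: "default", 1: "qr_code", 2: "barcode"}


def decoded_codes(response_body):
    """Return (type_name, text) for every code found in data.items."""
    codes = []
    for item in response_body.get("data", {}).get("items", []):
        # The example shows code_type as the string "0" while the table says
        # int, so convert before looking the name up.
        type_name = CODE_TYPE_NAMES.get(int(item["code_type"]), "unknown")
        codes.append((type_name, item["text"]))
    return codes
```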

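Text classification

Description

Text classification sorts input documents into classes according to rules set by the user, making it easier to organize documents of the same kind.

Request

Request method

HTTP method: POST

Request URI: /v1/document/classify

Request parameters

Field	Required	Type	Description
doc	required	string	Text content to classify

Request example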

{
  "doc": "string"
}

Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Classification response data
+msg_id	string	Unique ID
+results	Array of objects	Classification results
++class_id	int	Class ID
++score	float	Confidence, 0-100
++class_label	string	Class name
++debug_info	Array of objects	Debug information
+++unicode_start_pos	int	Unicode start position
+++start_pos	int	Start position, in UTF-8 encoding
+++length	int	Length, in UTF-8 encoding
+++keyword	string	Matched keyword
+++unicode_length	int	Unicode length

{
  "message": "string",
  "code": 0,
  "data": {
    "msg_id": "string",
    "results": [
      {
        "class_id": 0,
        "score": 0,
        "class_label": "string",
        "debug_info": [
          {
            "unicode_start_pos": 0,
            "start_pos": 0,
            "length": 0,
            "keyword": "string",
            "unicode_length": 0
          }
        ]
      }
    ]
  }
}
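
When several classes are returned, the usual next step is to keep the one with the highest confidence. The sketch below does that and returns `None` when nothing matched; `best_class` is a hypothetical helper name, and treating the highest `score` as the winner is this example's choice rather than documented behavior.

```python
def best_class(response_body):
    """Return (class_label, score) for the highest-scoring result, or None."""
    results = response_body.get("data", {}).get("results", [])
    if not results:
        return None
    top = max(results, key=lambda result: result["score"])
    return top["class_label"], top["score"]
```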

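Contract comparison

Submit a task

Description

Submits a contract comparison task: upload the two files to compare and receive a task ID.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/solution/contract/compare

Request parameters

Field	Type	Description
file_compare	string	Compared document (Base64-encoded)
file_base	string	Standard document (Base64-encoded)
file_compare_name	string	File name of the compared document, without extension (optional; if omitted, the platform names the file "比对文档", "compared document")
file_base_name	string	File name of the standard document, without extension (optional; if omitted, the platform names the file "标准文档", "standard document")

Request example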

{
  "file_compare": "string",
  "file_base": "string",
  "file_compare_name": "string",
  "file_base_name": "string"
}

Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
task_id	string	Comparison task ID

Response example

{
  "message": "string",
  "code": 0,
  "task_id": "string"
}

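Get the result

Retrieves the result of a contract comparison task by task ID.

Request

Request method

HTTP method: POST

Request URI: /v1/mage/solution/contract/detail

Request parameters

Field	Required	Type	Description
task_id		string	Comparison task ID

Request example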

{
  "task_id": "string"
}

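Response

Response parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Comparison result
+finished_page_num	int	Page number completed so far; page numbers start at 1
+result_ocr	Array of objects	OCR recognition results (deprecated)
++file_compare	string	OCR recognition result for this page of the compared document
++file_base	string	OCR recognition result for this page of the reference document
++page_num	int	Page number; page numbers start at 1
+task_id	string	Task ID
+total_page_num	int	Total number of pages
+result_details_ignore	Array of objects	Ignored differences
++diff_result	Array of objects	Diff results for this page
+++diff_index	int	Order in which the diff appears on the current page of the reference document
+++content_compare	string	Diff content in the compared document
+++location_compare	Array of objects	Position of the diff in the compared document. A diff may span pages and, within one page, may span lines; a diff that spans both is split by page first and then by line within each page
++++position_list	Array of objects	Diff regions on this page
+++++position	Array of objects	Coordinates of the four points: top-left, top-right, bottom-right, bottom-left
++++++y	int
++++++x	int
++++page_num	int	Number of this page (page numbers start at 1)
+++	Array of objects	Position of the diff in the reference document. A diff may span pages and, within one page, may span lines; a diff that spans both is split by page first and then by line within each page
++++position_list	Array of objects	Diff regions on this page
+++++position	Array of objects	Coordinates of the four points: top-left, top-right, bottom-right, bottom-left
++++++y	int
++++++x	int
++++page_num	int	Number of this page (page numbers start at 1)
+++diff_type	int	docUnderstandingDiffType. 0: enum default, not used; 1: modification; 2: deletion; 3: insertion
+++content_base	string	Diff content in the reference document
++page_num	int	Current page number; page numbers start at 1
+base_page_result	Array of objects	Recognition results for the reference document
++content	string	Full recognized text
++image_angle	int	Rotation angle
++page_num	int64	Page number; page numbers start at 1
+base_page_num	int64	Number of pages in the reference document
+compare_page_num	int64	Number of pages in the compared document
+result_url	string	Link to the comparison result
+compare_page_result	Array of objects	Recognition results for the compared document
++content	string	Full recognized text
++image_angle	int	Rotation angle
++page_num	int	Page number; page numbers start at 1
+task_error_msg	string	When the task status is "comparison failed", this field gives the reason for the failure
+result_summary	object	Overall comparison result
++diff_insert	int	Total number of insertion diffs
++diff_ignore	int	Total number of ignored diffs
++diff_page	int	Page numbers that contain diffs; page numbers start at 1
++diff_sum	int	Total number of diffs
++diff_delete	int	Total number of deletion diffs
++diff_replace	int	Total number of modification diffs
+task_status	int	docUnderstandingCompareTaskStatus. 0: enum default, not used; 1: waiting; 2: comparing; 3: stopped; 4: comparison complete; 5: comparison failed
+result_details	Array of objects	Detailed comparison results: the page on which each diff appears, the diff type, and the position of the diff in the two documents
++diff_result	Array of objects	Diff results for this page
+++diff_index	int	Order in which the diff appears on the current page of the reference document
+++content_compare	string	Diff content in the compared document
+++location_compare	Array of objects	Position of the diff in the compared document. A diff may span pages and, within one page, may span lines; a diff that spans both is split by page first and then by line within each page
++++position_list	Array of objects	Diff regions on this page
+++++position	Array of objects	Coordinates of the four points: top-left, top-right, bottom-right, bottom-left
++++++y	int
++++++x	int
++++page_num	int	Number of this page (page numbers start at 1)
+++	Array of objects	Position of the diff in the reference document. A diff may span pages and, within one page, may span lines; a diff that spans both is split by page first and then by line within each page
++++position_list	Array of objects	Diff regions on this page
+++++position	Array of objects	Coordinates of the four points: top-left, top-right, bottom-right, bottom-left
++++++y	int
++++++x	int
++++page_num	int	Number of this page (page numbers start at 1)
+++diff_type	int	docUnderstandingDiffType. 0: enum default, not used; 1: modification; 2: deletion; 3: insertion
+++content_base	string	Diff content in the reference document
++page_num	int	Current page number; page numbers start at 1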

返回示例

{
  "message": "string",
  "code": 0,
  "data": {
    "finished_page_num": 0,
    "result_ocr": [
      {
        "file_compare": "string",
        "file_base": "string",
        "page_num": 0
      }
    ],
    "task_id": "string",
    "total_page_num": 0,
    "result_details_ignore": [
      {
        "diff_result": [
          {
            "diff_index": 0,
            "content_compare": "string",
            "location_compare": [
              {
                "position_list": [
                  {
                    "position": [
                      {
                        "y": 0,
                        "x": 0
                      }
                    ]
                  }
                ],
                "page_num": 0
              }
            ],
            "location_base": [
              {
                "position_list": [
                  {
                    "position": [
                      {
                        "y": 0,
                        "x": 0
                      }
                    ]
                  }
                ],
                "page_num": 0
              }
            ],
            "diff_type": "0",
            "content_base": "string"
          }
        ],
        "page_num": 0
      }
    ],
    "base_page_result": [
      {
        "content": "string",
        "image_angle": 0,
        "page_num": 0
      }
    ],
    "base_page_num": 0,
    "compare_page_num": 0,
    "result_url": "string",
    "compare_page_result": [
      {
        "content": "string",
        "image_angle": 0,
        "page_num": 0
      }
    ],
    "task_error_msg": "string",
    "result_summary": {
      "diff_insert": 0,
      "diff_ignore": 0,
      "diff_page": [
        0
      ],
      "diff_sum": 0,
      "diff_delete": 0,
      "diff_replace": 0
    },
    "task_status": "0",
    "result_details": [
      {
        "diff_result": [
          {
            "diff_index": 0,
            "content_compare": "string",
            "location_compare": [
              {
                "position_list": [
                  {
                    "position": [
                      {
                        "y": 0,
                        "x": 0
                      }
                    ]
                  }
                ],
                "page_num": 0
              }
            ],
            "location_base": [
              {
                "position_list": [
                  {
                    "position": [
                      {
                        "y": 0,
                        "x": 0
                      }
                    ]
                  }
                ],
                "page_num": 0
              }
            ],
            "diff_type": "0",
            "content_base": "string"
          }
        ],
        "page_num": 0
      }
    ]
  }
}
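
A caller typically checks the task status before reading the diffs. The sketch below works on an already-parsed response dictionary; the function names are hypothetical. It assumes that statuses 3 (stopped), 4 (finished) and 5 (failed) are terminal, which the table implies but does not state. It also accepts enum values as either integers or strings, because the table declares them as int while the example shows `"0"`.

```python
# Names for the documented docUnderstandingDiffType values (0 is unused).
DIFF_TYPES = {1: "modify", 2: "delete", 3: "insert"}

# Assumption: stopped (3), finished (4) and failed (5) never change again.
TERMINAL_STATUSES = {3, 4, 5}


def task_finished(response):
    """Return True once task_status has reached a terminal value."""
    return int(response["data"]["task_status"]) in TERMINAL_STATUSES


def list_diffs(response):
    """Flatten result_details into (page_num, diff kind, base text, compared text) tuples."""
    rows = []
    for page in response["data"].get("result_details", []):
        for diff in page.get("diff_result", []):
            kind = DIFF_TYPES.get(int(diff["diff_type"]), "unknown")
            rows.append((
                page["page_num"],
                kind,
                diff.get("content_base", ""),
                diff.get("content_compare", ""),
            ))
    return rows


if __name__ == "__main__":
    # Hand-written sample shaped like the response example; not real API output.
    sample = {
        "code": 0,
        "data": {
            "task_status": "4",
            "result_details": [
                {"page_num": 1, "diff_result": [
                    {"diff_type": "1", "content_base": "100", "content_compare": "200"},
                ]},
            ],
        },
    }
    if task_finished(sample):
        for row in list_diffs(sample):
            print(row)
```

A polling loop would call the detail endpoint repeatedly, sleeping between calls, until `task_finished` returns True.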

Download Result

Downloads the result of a contract comparison: returns a link to the comparison result for the given task ID.

Request Description

Request Method

HTTP method: POST

Request URI: /v1/mage/solution/contract/files

Request Parameters

Field	Required	Type	Description
task_id		string	Comparison task ID

Request Example

{
  "task_id": "string"
}

Response Description

Response Parameters

Field	Type	Description
message	string	Message
code	int	Status code
link	string	Link to the comparison result

Response Example

{
  "message": "string",
  "code": 0,
  "link": "string"
}

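Self-Trained Document Extraction

Submit Task

Submits a task to a self-trained document extraction model: upload one file to be analyzed, and the API returns a task ID.

Request Description

Request Method

HTTP method: POST

Request URI: /v1/mage/idp/extractor/create

Request Parameters

Field	Required	Type	Description
file_base64	required	string	Base64 encoding of the file's binary content

Request Example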

{
  "file_base64": "string"
}

Response Description

Response Parameters

Field	Type	Description
message	string	Message
code	int	Status code
task_id	string	Task ID

Response Example

{
  "message": "string",
  "code": 0,
  "task_id": "string"
}
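
To show how this request fits together with the application signature rule from section 4.1 (sha1 of nonce + timestamp + secret_key), here is a hedged sketch that prepares the URL, headers and body without sending anything. The function names are hypothetical. Two things are assumptions rather than statements from this document: that the full URL is the documented domain followed by the Request URI, and that a `Content-Type: application/json` header is wanted.

```python
import base64
import hashlib
import json
import secrets
import time

BASE_URL = "https://cloud.laiye.com/idp"  # domain given at the top of this document


def sign_headers(pubkey, secret_key, timestamp=None, nonce=None):
    """Build the Api-Auth-* headers; timestamp and nonce can be fixed for testing."""
    timestamp = str(int(time.time())) if timestamp is None else str(timestamp)
    nonce = secrets.token_hex(8) if nonce is None else nonce
    # Signature rule from section 4.1: sha1(nonce + timestamp + secret_key).
    sign = hashlib.sha1((nonce + timestamp + secret_key).encode("utf-8")).hexdigest()
    return {
        "Api-Auth-pubkey": pubkey,
        "Api-Auth-timestamp": timestamp,
        "Api-Auth-nonce": nonce,
        "Api-Auth-sign": sign,
        "Content-Type": "application/json",  # assumption, not stated in this document
    }


def build_extractor_create_request(file_bytes, pubkey, secret_key):
    """Return (url, headers, body) for POST /v1/mage/idp/extractor/create."""
    url = BASE_URL + "/v1/mage/idp/extractor/create"  # assumption: domain + Request URI
    body = json.dumps({"file_base64": base64.b64encode(file_bytes).decode("ascii")})
    return url, sign_headers(pubkey, secret_key), body


if __name__ == "__main__":
    url, headers, body = build_extractor_create_request(b"%PDF-1.4 ...", "my_pubkey", "my_secret")
    print(url)
    print(sorted(headers))
```

The returned triple can be handed to any HTTP client, for example `requests.post(url, headers=headers, data=body)`.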

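Get Result

Retrieves the result of a multi-page extraction model task: queries the extraction result by task ID.

Request Description

Request Method

HTTP method: POST

Request URI: /v1/mage/idp/extractor/query

Request Parameters

Field	Required	Type	Description
task_id	required	string	Task ID

Request Example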

{
  "task_id": "string"
}

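Response Description

Response Parameters

Field	Type	Description
message	string	Message
code	int	Status code
data	object	Extraction task result
+status	int	IDP extraction model task status. Default: "0", Enum: "0" "1" "2" "3" "4" "5". 0: default; 1: waiting; 2: recognizing; 3: extracting; 4: succeeded; 5: failed.
+task_id	string	Task ID
+general_results	Array of objects	General text recognition results
++tables	Array of objects	Table recognition information
+++column	int	Total number of columns in the table
+++cells	Array of objects	Information about the cells in the table
++++end_col	int	End column of the cell
++++start_row	int	Start row of the cell (start_row=0 when the cell is in the first row; if the start row and end row are both 0, the cells of all columns are merged)
++++positions	Array of objects	Cell coordinates (the four corner points, clockwise from the top-left)
+++++y	int
+++++x	int
++++content	string	Text content of the cell
++++end_row	int	End row of the cell
++++start_col	int	Start column of the cell (start_col=0 when the cell is in the first column; if the start column and end column are both 0, the cells of all rows are merged)
+++table_id	int	Table number, starting at 0
+++row	int	Total number of rows in the table
++struct_content	object	ocrStructContent
+++paragraph	Array of objects	Array of paragraph information from the recognition result
++++content	string	Recognized text content
++++paragraph_id	int	Paragraph number
++items	Array of objects	Text block recognition information
+++probabilities	Array of objects	Probability of each individual character in the recognition result
++++char	string
++++probability	float
+++positions	Array of objects	Text block coordinates (the four corner points, clockwise from the top-left)
++++y	int
++++x	int
+++content	string	Text block content
+++char_positions	Array of objects	Array of coordinates for each character; its length should equal the length of content
++++positions	Array of objects	Character coordinates (the four corner points, clockwise from the top-left)
+++++y	int
+++++x	int
++rotated_image_width	int	Image width after rotation
++image_angle	int	Image rotation angle
++rotated_image_height	int	Image height after rotation
+fields	Array of objects	Extraction results
++field_type	int	Field type. Default: "0", Enum: "0" "1" "2". 0: default, not used; 1: string; 2: string array
++text_list	object	String array object
+++values	Array of objects	Array of string values
++++positions	Array of objects	Coordinate regions
+++++positions	Array of objects	Coordinates
++++++content	string	Text content
++++++position	Array of objects	Coordinates of the four corner points: top-left, top-right, bottom-right, bottom-left
+++++++y	int
+++++++x	int
+++++page_num	int	Page number
++++value	string	Value
++field_name	string	Field name
++field_hash	string	Field hash
++text	object	IDP string value
+++positions	Array of objects	Coordinate regions
++++positions	Array of objects	Coordinates
+++++content	string	Text content
+++++position	Array of objects	Coordinates of the four corner points: top-left, top-right, bottom-right, bottom-left
++++++y	int
++++++x	int
++++page_num	int	Page number
+++value	string	Value
+validate_results	Array of objects	Rule validation results
++rule_type	int	Field validation rule type. Default: "0", Enum: "0" "1" "2". 0: default, not used; 1: non-empty; 2: code block
++hash	string	Rule hash
++name	string	Rule name
++success	boolean	Whether the validation succeeded
++field_names	Array of strings	List of field names
++fail_detail	string	Error message
++progress	int	Completion percentage
++ocr_results	Array of objects	OCR results
+++tables	Array of objects	Table recognition information
++++column	int	Total number of columns in the table
++++cells	Array of objects	Information about the cells in the table
+++++end_col	int	End column of the cell
+++++start_row	int	Start row of the cell (start_row=0 when the cell is in the first row; if the start row and end row are both 0, the cells of all columns are merged)
+++++positions	Array of objects	Cell coordinates (the four corner points, clockwise from the top-left)
++++++y	int
++++++x	int
+++++content	string	Text content of the cell
+++++end_row	int	End row of the cell
+++++start_col	int	Start column of the cell (start_col=0 when the cell is in the first column; if the start column and end column are both 0, the cells of all rows are merged)
++++table_id	int	Table number, starting at 0
++++row	int	Total number of rows in the table
+++struct_content	object	ocrStructContent
++++paragraph	Array of objects	Array of paragraph information from the recognition result
+++++content	string	Recognized text content
+++++paragraph_id	int	Paragraph number
++++page	Array of objects	Array of page information from the recognition result
+++++content	string	Recognized text content
+++++page_id	int	Page number
++++row	Array of objects	Array of line information from the recognition result
+++++content	string	Recognized text content
+++++row_id	int	Line number
+++items	Array of objects	Text block recognition information
++++probabilities	Array of objects	Probability of each individual character in the recognition result
+++++char	string
+++++probability	float
++++positions	Array of objects	Text block coordinates (the four corner points, clockwise from the top-left)
+++++y	int
+++++x	int
++++content	string	Text block content
++++char_positions	Array of objects	Array of coordinates for each character; its length should equal the length of content
+++++positions	Array of objects	Character coordinates (the four corner points, clockwise from the top-left)
++++++y	int
++++++x	int
+++rotated_image_width	int	Image width after rotation
+++image_angle	int	Image rotation angle
+++rotated_image_height	int	Image height after rotation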
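
The `fields` array is usually what a caller wants. The sketch below reduces a parsed response to a plain name-to-value dictionary and lists the validation rules that did not pass. The function names are hypothetical, and the code assumes that `text` is populated when field_type is 1 and `text_list` when field_type is 2, which the table suggests but does not spell out.

```python
def fields_to_dict(response):
    """Map each field_name to a string (field_type 1) or a list of strings (field_type 2)."""
    result = {}
    for field in response["data"].get("fields", []):
        # Enum values may arrive as ints or as strings such as "1".
        field_type = int(field.get("field_type", 0))
        if field_type == 1:
            result[field["field_name"]] = field.get("text", {}).get("value", "")
        elif field_type == 2:
            values = field.get("text_list", {}).get("values", [])
            result[field["field_name"]] = [item.get("value", "") for item in values]
        # field_type 0 is the unused default, so such entries are skipped.
    return result


def failed_rules(response):
    """Return (rule name, fail_detail) for every validation rule whose success is false."""
    return [
        (rule.get("name", ""), rule.get("fail_detail", ""))
        for rule in response["data"].get("validate_results", [])
        if not rule.get("success", False)
    ]


if __name__ == "__main__":
    # Hand-written sample shaped like the parameter table; not real API output.
    sample = {
        "code": 0,
        "data": {
            "status": "4",
            "fields": [
                {"field_type": 1, "field_name": "buyer", "text": {"value": "ACME Ltd."}},
                {"field_type": 2, "field_name": "items",
                 "text_list": {"values": [{"value": "bolt"}, {"value": "nut"}]}},
            ],
            "validate_results": [
                {"name": "buyer_not_empty", "success": True, "fail_detail": ""},
                {"name": "total_check", "success": False, "fail_detail": "sum mismatch"},
            ],
        },
    }
    print(fields_to_dict(sample))
    print(failed_rules(sample))
```

Coordinates and page numbers are dropped on purpose here; a caller that needs to highlight the extracted text would read `positions` from the same objects.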

Response Example

{
  "message": "string",
  "code": 0,
  "data": {
    "status": "0",
    "task_id": "string",
    "general_results": [
      {
        "tables": [
          {
            "column": 0,
            "cells": [
              {
                "end_col": 0,
                "start_row": 0,
                "positions": [
                  {
                    "y": 0,
                    "x": 0
                  }
                ],
                "content": "string",
                "end_row": 0,
                "start_col": 0
              }
            ],
