PDF-Text-OCR
Command Description
Recognizes the text on the specified pages of a PDF file using Laiye Intelligent Document Processing Text OCR and returns the recognition result in JSON format. If an error occurs while recognizing a multi-page document, the entire recognition returns an error and the quota is still consumed
Command Prototype
jsonRet = Mage.PDFOCRText(config, path, password, all_pg_state, page_cfg, sleepTime, time)
Parameter Description
Parameter | Required | Type | Default | Description |
---|---|---|---|---|
config | True | expression | {} | Call configuration of Laiye IDP |
path | True | path | '''C:\Users''' | Path of the PDF file |
password | True | string | "" | Password of the PDF file. Leave empty if the file has no password |
all_pg_state | True | boolean | None | Whether to recognize all pages. When set to "Yes", all pages are recognized and any specified page numbers are ignored. When set to "No", only the pages specified in page_cfg are recognized |
page_cfg | True | expression | [[1,2]] | Positive integer and array formats are supported. For example, if users input 2, the second page is recognized. If users input [1,3,5], the first, third, and fifth pages are recognized. If users input [1,[6,9],4], the first, fourth, sixth, and ninth pages are recognized. When "Recognize all pages" is set to "Yes", the value entered here is ignored. An error is reported if a specified page number exceeds the total number of pages in the PDF. Each page is recognized only once |
sleepTime | True | number | 10000 | The interval, in milliseconds, between recognizing each page of the PDF file. Default: 10,000 milliseconds (10 seconds). Recognizing a large number of pages with a short interval may raise a call-frequency-exceeded exception |
time | True | number | 30000 | Specify the waiting time in milliseconds. If exceeded, an exception will be thrown. Default: 30,000 milliseconds (30 seconds) |
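
The page_cfg rules described in the table can be sketched as follows. This is a minimal illustration written in Python, not UiBot script, and it is not Laiye's implementation: the function name `expand_page_cfg` and the `total_pages` argument are hypothetical. It follows the table's example literally, so a nested array is treated as a list of individual pages, duplicates are collapsed, and a page number beyond the end of the document is rejected.

```python
def expand_page_cfg(page_cfg, total_pages):
    """Flatten a page_cfg value into a sorted list of unique page numbers.

    Hypothetical helper for illustration only; the real command performs
    its own validation inside Laiye IDP.
    """
    pages = set()  # a set guarantees each page is listed only once

    def visit(item):
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(item, bool) or not isinstance(item, (int, list)):
            raise TypeError(f"unsupported page_cfg element: {item!r}")
        if isinstance(item, int):
            if item < 1:
                raise ValueError(f"page numbers must be positive: {item}")
            if item > total_pages:
                raise ValueError(
                    f"page {item} exceeds the total of {total_pages} pages"
                )
            pages.add(item)
        else:
            # Nested arrays are flattened, per the [1,[6,9],4] example
            for sub in item:
                visit(sub)

    visit(page_cfg)
    return sorted(pages)
```

Under these assumptions, `expand_page_cfg([1, [6, 9], 4], 10)` yields pages 1, 4, 6, and 9, matching the example in the table.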
Return
jsonRet, the variable used to save the output of the command.
Demo
Dim path='''C:\Users''' // The path of the PDF to be recognized
Dim config={"Pubkey":"","Secret":"","Url":""} // Obtain these values from Mage
TracePrint "--------------------PDF Text Recognition--------------------"
// --------------------------------------------------------
// [Remarks] Recognize the text on the specified pages of a PDF file
// Input parameter 1:
// config--Mage configuration; Pubkey and Secret must be configured. Type: Dict
// Input parameter 2:
// path--The path of the PDF file to be recognized. Type: String
// Input parameter 3:
// password--PDF file password. Leave empty if there is no password. Type: String
// Input parameter 4:
// all_pg_state--Whether all pages are recognized. Type: Bool
// Input parameter 5:
// page_cfg--The page numbers to recognize. Type: List
// Input parameter 6:
// sleepTime--Interval between pages. Default unit: milliseconds. Type: Int
// Input parameter 7:
// time--Timeout. Default unit: milliseconds. Type: Int
// Output parameters:
// jsonRet: The variable to which the output of the function call is saved
// Command prototype: jsonRet = Mage.PDFOCRText(config,path,password,all_pg_state,page_cfg,sleepTime,time)
// --------------------------------------------------------
jsonRet = Mage.PDFOCRText(config, path,"",false,[1],10000,30000)
TracePrint(jsonRet)