PDF-Text-OCR

Command Description

Recognizes the text on the specified pages of a PDF file using Laiye Intelligent Document Processing (IDP) text OCR and returns the recognition result in JSON format. If an error occurs while recognizing multiple pages, the entire recognition returns an error, and quota is still consumed.

Command Prototype

jsonRet = Mage.PDFOCRText(config, path, password, all_pg_state, page_cfg, sleepTime, time)

Parameter Description

| Parameter | Required | Type | Default | Description |
|---|---|---|---|---|
| config | True | expression | {} | Call configuration of Laiye IDP |
| path | True | path | '''C:\Users''' | Path of the PDF file |
| password | True | string | "" | PDF file password. Leave empty if the file has no password |
| all_pg_state | True | boolean | None | Whether to recognize all pages. When set to "Yes", all pages are recognized and any specified page numbers are ignored. Otherwise, the pages given in page_cfg are recognized |
| page_cfg | True | expression | [[1,2]] | Accepts a positive integer or an array. For example, 2 recognizes the second page; [1,3,5] recognizes the first, third, and fifth pages; [1,[6,9],4] recognizes the first, fourth, sixth, and ninth pages. When "Recognize all pages" (all_pg_state) is set to "Yes", this value is ignored. An error is reported if a page number exceeds the total number of pages in the PDF. Each page is recognized only once |
| sleepTime | True | number | 10000 | Interval, in milliseconds, between recognizing each page of the PDF file. Default: 10,000 milliseconds (10 seconds). Recognizing many pages with a short interval may raise a call-frequency-exceeded exception |
| time | True | number | 30000 | Timeout in milliseconds. If it is exceeded, an exception is thrown. Default: 30,000 milliseconds (30 seconds) |
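
The page_cfg rules above (a single integer or an array, nested arrays allowed, each page recognized only once, an error when a page exceeds the PDF's page count) can be sketched as a small standalone Python function. This is only an illustration of the documented behavior, not Laiye's actual implementation: `resolve_pages` and `total_pages` are hypothetical names, and a nested array is treated as a list of individual pages because that is how the [1,[6,9],4] example is described.

```python
from typing import List, Union

PageCfg = Union[int, list]


def resolve_pages(page_cfg: PageCfg, total_pages: int, all_pages: bool = False) -> List[int]:
    """Return the sorted, de-duplicated 1-based page numbers to recognize.

    Hypothetical helper mirroring the documented page_cfg rules.
    """
    if all_pages:
        # all_pg_state set to "Yes": page_cfg is ignored, every page is used
        return list(range(1, total_pages + 1))

    pages = set()  # a set guarantees each page is recognized only once

    def collect(item) -> None:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(item, bool) or not isinstance(item, (int, list)):
            raise TypeError(f"unsupported page_cfg element: {item!r}")
        if isinstance(item, int):
            if item < 1:
                raise ValueError(f"page numbers must be positive: {item}")
            if item > total_pages:
                raise ValueError(f"page {item} exceeds total pages ({total_pages})")
            pages.add(item)
        else:
            # nested arrays are flattened into individual page numbers
            for sub in item:
                collect(sub)

    collect(page_cfg)
    return sorted(pages)
```

For instance, `resolve_pages([1, [6, 9], 4], total_pages=10)` yields pages 1, 4, 6, and 9, while `resolve_pages([4], total_pages=3)` raises an error because page 4 does not exist in a three-page file.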

Return

jsonRet: The variable that stores the command's output (the recognition result in JSON format).

Demo

Dim path='''C:\Users'''  // The path of the PDF to be recognized 
Dim config={"Pubkey":"","Secret":"","Url":""} // Get from mage

TracePrint "--------------------PDF Text Recognition--------------------"
// --------------------------------------------------------
// [Remarks] Recognize the text content of the specified pages of a PDF file
// Input parameter 1:
// config--Mage configuration; Pubkey and Secret must be configured. Type: Dict
// Input parameter 2:
// path--The path of the PDF file to be recognized. Type: String
// Input parameter 3:
// password--PDF file password. Use "" if the file has no password. Type: String
// Input parameter 4:
// all_pg_state--Whether to recognize all pages. Type: Bool
// Input parameter 5:
// page_cfg--The page numbers to recognize. Type: List
// Input parameter 6:
// sleepTime--Interval between pages. Unit: milliseconds. Type: Int
// Input parameter 7:
// time--Timeout. Unit: milliseconds. Type: Int

// Output parameters:
// jsonRet: The variable to which the output of the function call is saved

// Command prototype: jsonRet = Mage.PDFOCRText(config,path,password,all_pg_state,page_cfg,sleepTime,time)
// --------------------------------------------------------

jsonRet = Mage.PDFOCRText(config, path,"",false,[1],10000,30000)
TracePrint(jsonRet)