OCR REST API Developer's Guide
The OCR Web Service REST API can be consumed from any programming language or application that supports web service
integration. In this section, examples are shown for the following technologies. Sample code is available here:
Because the API follows REST principles, applications are easy to write and test. You can use your browser to access URLs, and you can use almost any HTTP client in any programming language to interact with the API.
The REST API methods are described below:
REST method | processDocument |
HTTP type | [POST] |
Base URL | http(s)://www.ocrwebservice.com/restservices/processDocument |
Description | This method recognizes a scanned document or image. It loads the image and processes it with the specified parameters. The result is the extracted text or a URL for downloading the output file. |
Input parameters:
Parameter name | Default value | Description |
language | "english" | Specifies the recognition language. This parameter can contain several language names separated by commas, for example "language=english,german,spanish" (see supported recognition languages). |
pagerange | "allpages" | Page numbers and/or page ranges separated by commas, for example "pagerange=1,3,5-12" or "pagerange=allpages". |
tobw | "false" | Converts the image to black and white (recommended for color images and photos). |
zone | | Specifies the regions of the image for zonal OCR. Coordinates are given in pixels relative to the top-left corner, in the format top:left:height:width. This parameter can contain several zones separated by commas, for example "zone=0:0:100:100,50:50:50:50". |
outputformat | | Specifies the output file format (see supported output formats). Up to two output formats can be specified, separated by commas (example: "pdf,txt"). |
gettext | "false" | Specifies whether the extracted text is returned. |
getwords | "false" | If "true", the coordinates of the recognized words are returned. |
newline | | If "newline=1", the extracted text is returned with newline (\n) characters. |
description | | Optional parameter. Specifies your task description, which is returned in the response. |
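The zone format described in the table can be produced mechanically. The following is a minimal sketch, not part of the service: the helper name format_zones is hypothetical, and the rejection of negative offsets or non-positive sizes is an assumption added for illustration rather than a documented rule.

```python
from typing import Iterable, Tuple

# One zone as (top, left, height, width), all in pixels.
Zone = Tuple[int, int, int, int]


def format_zones(zones: Iterable[Zone]) -> str:
    """Join zones into the comma-separated top:left:height:width form."""
    parts = []
    for top, left, height, width in zones:
        # Assumption: offsets must be >= 0 and sizes must be > 0.
        if min(top, left) < 0 or min(height, width) <= 0:
            raise ValueError(f"invalid zone: {(top, left, height, width)}")
        parts.append(f"{top}:{left}:{height}:{width}")
    return ",".join(parts)


print(format_zones([(0, 0, 100, 100), (50, 50, 50, 50)]))
# prints 0:0:100:100,50:50:50:50
```

The returned string is the value to pass as the zone parameter.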
NOTE: All parameters are optional, but for processing you must specify 'gettext' or 'outputformat'.
Example:
http://www.ocrwebservice.com/restservices/processDocument?language=english&pagerange=1-5&gettext=true&outputformat=doc
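A request URL like the one above can be assembled from a parameter dictionary. This is a minimal sketch using only the Python standard library; the function name build_process_url is hypothetical, and the check interprets the note above as "gettext=true or a non-empty outputformat". It only builds the URL: how the document is uploaded and how the request is authenticated are not covered in this section.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ocrwebservice.com/restservices/processDocument"


def build_process_url(params: dict) -> str:
    """Build a processDocument URL from query parameters."""
    wants_text = str(params.get("gettext", "false")).lower() == "true"
    # Assumed reading of the note: at least one result type must be requested.
    if not wants_text and not params.get("outputformat"):
        raise ValueError("specify gettext=true or outputformat")
    # Keep commas and colons literal, as they appear in the guide's examples.
    return BASE_URL + "?" + urlencode(params, safe=",:")


url = build_process_url({
    "language": "english",
    "pagerange": "1-5",
    "gettext": "true",
    "outputformat": "doc",
})
print(url)
```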
HTTP Status codes:
The OCR Web Service REST API returns the following HTTP status codes:
HTTP status code | Description |
200 | Successful request |
400 | Bad Request |
401 | Unauthorized Request |
402 | Payment Required |
500 | Internal Server Error |
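A client will typically turn these codes into either a normal result or an exception. The sketch below shows one way to do that; the exception class OcrApiError and the function check_status are hypothetical names, and treating every code other than 200 as a failure is an assumption.

```python
# Status codes and descriptions taken from the table above.
STATUS_DESCRIPTIONS = {
    200: "Success",
    400: "Bad Request",
    401: "Unauthorized Request",
    402: "Payment Required",
    500: "Internal Server Error",
}


class OcrApiError(Exception):
    """Hypothetical exception type for non-200 responses."""

    def __init__(self, status: int):
        self.status = status
        description = STATUS_DESCRIPTIONS.get(status, "Unexpected status")
        super().__init__(f"{status} {description}")


def check_status(status: int) -> None:
    """Raise OcrApiError unless the status code is 200."""
    if status != 200:
        raise OcrApiError(status)


check_status(200)  # no exception for a successful request
```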
Response formats:
The OCR Web Service REST API returns HTTP responses in the following formats:
Format | Response type | Requested via |
XML | Complex value responses | application/xml in the HTTP Accept header |
JSON | Complex value responses | application/json in the HTTP Accept header |
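Selecting the response format amounts to setting the Accept header. The sketch below constructs (but does not send) a POST request with the standard library's urllib.request; the function name make_request is hypothetical, and the request body is only a placeholder, since this section does not describe how the document is uploaded.

```python
import urllib.request

ACCEPT_TYPES = {"json": "application/json", "xml": "application/xml"}


def make_request(url: str, body: bytes, response_format: str = "json") -> urllib.request.Request:
    """Create a POST request that asks for JSON or XML via the Accept header."""
    accept = ACCEPT_TYPES[response_format]
    return urllib.request.Request(
        url,
        data=body,  # placeholder payload; the upload format is an assumption
        headers={"Accept": accept},
        method="POST",
    )


req = make_request(
    "http://www.ocrwebservice.com/restservices/processDocument?gettext=true",
    b"",
)
print(req.get_method(), req.get_header("Accept"))
```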
Response format fields:
The response contains the following fields:
Field | Description |
AvailablePages | Available pages for the current subscription plan |
ProcessedPages | The number of pages processed for the last task |
OCRText | Two-dimensional array that contains the extracted text for each zone and page. OCRText[z][p]: z is the zone, p is the page |
OutputFileUrl | The URL of the converted output file |
TaskDescription | This custom field contains the same value as specified in the input 'description' parameter |
Reserved | Reserved fields |
ErrorMessage | If an error occurred during processing, this field contains the error message |
Example of response format (JSON) for the status code 200:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: nnn
{
"ErrorMessage":"",
"AvailablePages" :287848,
"ProcessedPages" :5,
"OCRText":
[
[
"Sample extracted text from zone 01 (Page 01)",
"Sample extracted text from zone 01 (Page 02)"
],
[
"Sample extracted text from zone 02 (Page 01)",
"Sample extracted text from zone 02 (Page 02)"
]
],
"OutputFileUrl":"http://www.ocrwebservice.com/uploads/_restservice/201504/08/9b66_468697e9-7986-4be1-9da6-ee62a617a2d1.doc",
"TaskDescription":"sampletask",
"Reserved":[]
}
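Because OCRText is indexed first by zone and then by page, reading a specific result is a two-level lookup. The following sketch parses a shortened copy of the JSON body above with the standard json module; the helper name text_at is hypothetical.

```python
import json

# Shortened copy of the 200 response body shown above.
body = """
{
  "ErrorMessage": "",
  "AvailablePages": 287848,
  "ProcessedPages": 5,
  "OCRText": [
    ["Sample extracted text from zone 01 (Page 01)",
     "Sample extracted text from zone 01 (Page 02)"],
    ["Sample extracted text from zone 02 (Page 01)",
     "Sample extracted text from zone 02 (Page 02)"]
  ],
  "TaskDescription": "sampletask",
  "Reserved": []
}
"""


def text_at(response: dict, zone: int, page: int) -> str:
    """Return the text for a zero-based zone index and page index."""
    return response["OCRText"][zone][page]


response = json.loads(body)
print(text_at(response, 1, 0))
# prints Sample extracted text from zone 02 (Page 01)
```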
Example of response format (JSON) for the status code 400 (Bad request):
HTTP/1.1 400 BAD REQUEST
Content-Type: application/json
Content-Length: nnn
{
"OCRErrorMessage":"Recognition result has not been specified",
"AvailablePages" :287848,
"ProcessedPages" :0,
"OCRText":[],
"OutputFileUrl":"",
"TaskDescription":"",
"Reserved":[]
}
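The field table names the error field ErrorMessage, while the 400 example above uses OCRErrorMessage. A defensive client can check both keys. This is a sketch under that assumption; the function name error_message is hypothetical.

```python
import json


def error_message(response: dict) -> str:
    """Return the error text from either key, or an empty string."""
    # Assumption: either key may appear, as both occur in this guide.
    return response.get("ErrorMessage") or response.get("OCRErrorMessage") or ""


bad_request = json.loads(
    '{"OCRErrorMessage": "Recognition result has not been specified",'
    ' "AvailablePages": 287848, "ProcessedPages": 0, "OCRText": [],'
    ' "OutputFileUrl": "", "TaskDescription": "", "Reserved": []}'
)
print(error_message(bad_request))
# prints Recognition result has not been specified
```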
REST method | getAccountInformation |
HTTP type | [GET] |
Base URL | http(s)://www.ocrwebservice.com/restservices/getAccountInformation |
Description | This method returns the current account information |
Response format fields:
The response contains the following fields:
Field | Description |
AvailablePages | Available pages for the current subscription plan |
MaxPages | Maximum pages for the current subscription plan |
LastProcessingTime | Last processing time |
SubcriptionPlan | The current subscription plan |
ExpirationDate | Date when the license expires |
ErrorMessage | If an error occurred during processing, this field contains the error message |
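The account fields can be used to track how much of a plan's quota remains. The sketch below works on a made-up response body: the values are illustrative, not real service output, the function name pages_used is hypothetical, and computing usage as MaxPages minus AvailablePages is an interpretation of the field descriptions rather than documented behavior.

```python
import json

# Illustrative values only; field names follow the table above.
account = json.loads(
    '{"AvailablePages": 80, "MaxPages": 100, "ErrorMessage": ""}'
)


def pages_used(info: dict) -> int:
    """Assumed usage figure: plan maximum minus pages still available."""
    return info["MaxPages"] - info["AvailablePages"]


print(pages_used(account))
# prints 20
```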