Create a gocr Configuration

Usage

gocrConfig(
  inputfile,
  showhelp = FALSE,
  outputfile = "",
  errorfile = "",
  progressfile = "",
  databasepath = "",
  outputformat = "",
  greylevel = 0,
  dustsize = -1,
  spacewidth = 0,
  verbosity = 0,
  limitVerbosityToChars = "",
  limitRecognitionToChars = "",
  certainty = 95,
  mode = 0,
  onlyRecogniseNumbers = FALSE
)

Arguments

inputfile

-i file: read input from file (or stdin if file is a single dash)

showhelp

-h: show usage information

outputfile

-o file: send output to file instead of stdout. If "" (default), a file "gocrOut_<basename(outputfile)>" in tempdir() is used as the output file.

errorfile

-e file: send errors to file instead of stderr, or to stdout if file is a dash

progressfile

-x file: write progress output to file (file can be a file name, a fifo name or a file descriptor 1...255). This is useful for GUI developers who want to show the OCR progress. The file descriptor argument is only available if gocr was compiled with __USE_POSIX defined

databasepath

-p path: database path, which will be populated with images of learned characters. If "" (default) and a database is needed, a directory within the folder of the installed package is used

outputformat

-f format: output format of the recognized text (ISO8859_1 TeX HTML XML UTF8 ASCII), XML will also output position and probability data

greylevel

-l level: set grey level to level (0<160<=255, default: 0 for autodetect). Darker pixels belong to characters, brighter pixels are interpreted as background of the input image

dustsize

-d size: set dust size in pixels (clusters smaller than this are removed). 0 means no clusters are removed; the default is -1 for autodetect

spacewidth

-s num: set spacewidth between words in units of dots (default: 0 for autodetect). Wider gaps are interpreted as word spaces, narrower gaps as character spaces

verbosity

-v verbosity: be verbose to stderr; verbosity is a bitfield. Use optionValueVerbosity to get a proper value

limitVerbosityToChars

-c string: only verbose output of characters from string to stderr; more output is generated for all characters within the string

limitRecognitionToChars

-C string: only recognise characters from string, this is a filter function in cases where the interest is only to a part of the character alphabet

certainty

-a certainty: set value for certainty of recognition (0..100; default: 95). Characters with a higher certainty are accepted, characters with a lower certainty are treated as unknown (not recognized). Set a higher value to accept only characters that were recognized with greater certainty

mode

-m mode: set operational mode; mode is a bitfield (default: 0). Use optionValueMode to get a proper value

onlyRecogniseNumbers

-n bool: if bool is non-zero, only recognise numbers (this is now obsolete, use -C "0123456789")
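
Examples

The arguments above can be combined as in the following minimal sketch. It assumes only the argument names and defaults shown in the Usage section; the input path is a hypothetical placeholder, and the call is guarded so that the snippet still runs when the package providing gocrConfig is not attached. The structure of the returned object is not documented here, so it is only inspected with str().

```r
# Sketch only: argument names and defaults come from the Usage section;
# the input path below is a hypothetical placeholder.
input <- file.path(tempdir(), "scan.pnm")

# Restrict recognition to digits via limitRecognitionToChars (the -C option),
# which replaces the obsolete onlyRecogniseNumbers (-n).
digits <- paste(0:9, collapse = "")

args <- list(
  inputfile = input,
  outputformat = "UTF8",             # one of the formats listed for -f
  greylevel = 0,                     # 0 = autodetect
  dustsize = -1,                     # -1 = autodetect
  certainty = 90,                    # lower than the default 95, so more characters are accepted
  limitRecognitionToChars = digits
)

# Only call gocrConfig() when the function is actually available.
if (exists("gocrConfig", mode = "function")) {
  cfg <- do.call(gocrConfig, args)
  str(cfg)
}
```

Collecting the arguments in a list and passing them with do.call() keeps the non-default settings in one place, which can be convenient when the same configuration is reused for several input files.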