This section outlines the steps required to install ImageDataExtractor.
We strongly advise using a virtual environment when installing ImageDataExtractor (click here to learn how).
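As a minimal sketch of one way to do this, the commands below create and activate an environment with Python's built-in venv module. The directory name ide-env is a hypothetical choice for illustration, and the activation line assumes a POSIX shell such as bash or zsh.

```shell
# Create a virtual environment in ./ide-env (hypothetical directory name)
python3 -m venv ide-env

# Activate it for the current shell session (POSIX shells)
. ide-env/bin/activate

# The interpreter on PATH should now be the one inside ide-env
command -v python
```

Running deactivate returns the shell to the system Python.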
ImageDataExtractor currently uses Tesseract 4 for text recognition. You can check your existing version by running:
$ tesseract -v
If required, the source code for the correct version can be downloaded here. Instructions for compiling it on your machine can be found here.
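The version check can also be scripted. The sketch below assumes the first line of the version banner looks like "tesseract 4.x.y" (some builds print "tesseract v4..."). The helper name is_tesseract_4 is hypothetical and not part of ImageDataExtractor or Tesseract.

```shell
# Succeed only if the given banner line reports Tesseract major version 4.
# (is_tesseract_4 is a hypothetical helper written for this example.)
is_tesseract_4() {
    case "$1" in
        "tesseract 4."*|"tesseract v4."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Take the first line of the banner; 2>&1 also captures it if a build writes to stderr
banner=$(tesseract -v 2>&1 | head -n 1)

if is_tesseract_4 "$banner"; then
    echo "Tesseract 4 detected: $banner"
else
    echo "Tesseract 4 not detected (got: ${banner:-nothing})" >&2
fi
```

If the check reports that Tesseract 4 was not detected, install or compile version 4 before continuing.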
Installing with pip is the simplest option. Run:
pip install imagedataextractor
Alternatively, to install from source, clone the repo and move into the directory:
git clone https://github.com/by256/imagedataextractor.git
cd imagedataextractor
Activate your virtual environment and install:
python setup.py install
Finally, download the data files needed for ChemDataExtractor-based document extraction:
cde data download
and you're ready to go!