Package nsi :: Package granulate :: Module GranulatePDF' :: Class GranulatePDF

Class GranulatePDF

object --+
         |
        GranulatePDF


- Provide the grain extraction functionality for PDF documents
- Retrieve tables and images

Instance Methods
 
__call__(self)
 
__del__(self)
When the object is destroyed, the temporary folder is removed with everything inside of it.
 
__getImageDocumentList(self)
Retrieves images from a PDF document
 
__getTableDocumentList(self)
Extracts tables from a PDF file using pyPdf2Table
 
__init__(self, Document=None)
Checks if the Document is a PDF file, then creates a temporary folder and saves...
 
getImageDocumentList(self)
Invokes the private method __getImageDocumentList to retrieve the document's images
 
getTableDocumentList(self)
Invokes the private method __getTableDocumentList to retrieve the document's tables
 
granulateDocument(self)
Extracts the grains from a document and returns a dictionary containing a list of tables and a list of images

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__
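
The relationship between the public and private methods listed above can be illustrated with a minimal sketch. It does not reproduce the GranulatePDF source: GranulateSketch is a hypothetical stand-in, the returned values are placeholders rather than real extraction results, and the dictionary keys "table_list" and "image_list" are assumptions, since this page only states that the dictionary holds a list of tables and a list of images.

```python
class GranulateSketch(object):
    """Hypothetical stand-in for GranulatePDF; no real PDF parsing happens."""

    def __getImageDocumentList(self):
        # Private worker. Python stores it under the mangled name
        # _GranulateSketch__getImageDocumentList.
        return ["image-0.png"]  # placeholder data, not a real extraction

    def __getTableDocumentList(self):
        # Private worker; each table is shown here as a list of rows.
        return [[["cell-a", "cell-b"]]]  # placeholder data

    def getImageDocumentList(self):
        # Public wrapper that only delegates to the private method.
        return self.__getImageDocumentList()

    def getTableDocumentList(self):
        # Public wrapper that only delegates to the private method.
        return self.__getTableDocumentList()

    def granulateDocument(self):
        # Key names are assumptions made for this sketch.
        return {
            "table_list": self.getTableDocumentList(),
            "image_list": self.getImageDocumentList(),
        }


grains = GranulateSketch().granulateDocument()
```

Because of name mangling, code outside the class cannot call __getImageDocumentList by that name, which is presumably why the public wrappers exist.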

Class Variables [hide private]
  Document = None
  __pathFolder = None
Properties

Inherited from object: __class__

Method Details

__init__(self, Document=None)
(Constructor)

Checks whether the Document is a PDF file, then creates a temporary folder and saves
the PDF file to the filesystem.

Overrides: object.__init__
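
The lifecycle described here (validate, create a temporary folder, save the file, remove the folder on destruction) might look roughly like the following sketch. Everything in it is an assumption made for illustration: the class name TempPdfHolder, the attributes path_folder and path_file, the "%PDF-" magic-byte check, and the choice to raise ValueError for non-PDF input. How GranulatePDF actually validates its Document is not stated on this page.

```python
import os
import shutil
import tempfile


class TempPdfHolder(object):
    """Hypothetical sketch of the constructor/destructor lifecycle."""

    def __init__(self, data=None, filename="document.pdf"):
        # Set first so that __del__ is safe even if validation fails below.
        self.path_folder = None
        # Assumption: a PDF is recognised by its leading "%PDF-" bytes.
        if data is None or not data.startswith(b"%PDF-"):
            raise ValueError("not a PDF document")
        # Create a private temporary folder and save the PDF inside it.
        self.path_folder = tempfile.mkdtemp()
        self.path_file = os.path.join(self.path_folder, filename)
        with open(self.path_file, "wb") as handle:
            handle.write(data)

    def close(self):
        # Remove the temporary folder together with everything inside it.
        folder = getattr(self, "path_folder", None)
        if folder and os.path.isdir(folder):
            shutil.rmtree(folder, ignore_errors=True)
        self.path_folder = None

    def __del__(self):
        # Mirrors the documented behaviour: cleanup when the object is destroyed.
        self.close()


holder = TempPdfHolder(b"%PDF-1.4\n%%EOF\n")
folder = holder.path_folder
saved = os.path.isfile(holder.path_file)
holder.close()
removed = not os.path.exists(folder)
```

The explicit close() method is an addition of this sketch. Python does not guarantee when, or whether, __del__ runs, so cleanup that relies only on the destructor can leave temporary folders behind.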