Module orchid :: Class Orchid

Class Orchid


The main class of the crawler. Use this to start the crawling process.
Method Summary
  __init__(self, seed, fetcherThreads, maxUrlsToCrawl, timeOut, delay, analyzer, args)
Creates a new crawler.
  crawl(self)
Performs the crawling operation.

Method Details

__init__(self, seed, fetcherThreads, maxUrlsToCrawl, timeOut, delay, analyzer=<class 'orchid.NaiveAnalyzer'>, args=[])
(Constructor)

Creates a new crawler.
Parameters:
seed - A map from each domain name to the URLs in that domain from which to start crawling.
fetcherThreads - The number of fetcher threads to use.
maxUrlsToCrawl - The maximum number of pages to crawl.
timeOut - The socket timeout for loading a page.
delay - The delay between crawls.
analyzer - The class of the analyzer to use. Defaults to orchid.NaiveAnalyzer.
args - Defaults to an empty list.
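
The seed parameter is described as a map of domain names to URLs, so a small helper that groups a flat list of start URLs by host may be useful. The following is only a sketch: build_seed is a hypothetical helper that is not part of orchid, it is written for Python 3 while this page was generated in 2005, and it assumes that each map value is a list of URLs. The constructor call is shown only as a comment because its argument values (thread count, page limit, timeout, delay) are illustrative guesses, and the units of timeOut and delay are not stated here.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def build_seed(urls):
    """Group start URLs by host name (hypothetical helper, not part of orchid)."""
    seed = defaultdict(list)
    for url in urls:
        domain = urlsplit(url).hostname  # None when the URL has no host part
        if domain:
            seed[domain].append(url)
    return dict(seed)


seed = build_seed([
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/",
])

# With the orchid module importable, the documented constructor could then
# be called along these lines (all argument values are illustrative only):
#
#   crawler = Orchid(seed, 4, 100, 10, 1)
#   crawler.crawl()
```

URLs without a host part are skipped rather than raising an error; whether the crawler itself tolerates such entries is not documented here.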

crawl(self)

Performs the crawling operation.

Generated by Epydoc 2.1 on Mon Dec 12 14:30:34 2005 http://epydoc.sf.net