Class SMCCrawler

Description

Loading external PHPCrawler-class

Uncomment for standalone

  • author: Uwe Hunfeld (phpcrawl@cuab.de)
  • version: 0.81

Located in /SitemapCreatorCrawler.class.php (line 25)

PHPCrawler
   |
   --SMCCrawler
Method Summary
void addURL_Entry (array $entry)
void enableLastModifiedCount (bool $mode)
void getLastModified (PHPCrawlerDocumentInfo $PageInfo, &$entry)
void handleDocumentInfo (PHPCrawlerDocumentInfo $PageInfo)
Methods
addURL_Entry (line 86)

Add a URL entry to $entries

  • section: 3 Crawler
  • access: protected
void addURL_Entry (array $entry)
  • array $entry: URL set to be added to sitemap
enableLastModifiedCount (line 96)

Enable or disable Last-Modified calculation ($LastModifiedCount)

  • section: 3 Crawler
  • access: public
void enableLastModifiedCount (bool $mode)
  • bool $mode: true to enable, false otherwise
getLastModified (line 66)

Get the Last-Modified header

  • section: 3 Crawler
  • access: protected
void getLastModified (PHPCrawlerDocumentInfo $PageInfo, &$entry)
  • PHPCrawlerDocumentInfo $PageInfo: A PHPCrawlerDocumentInfo-object containing all information about the currently received document.
  • &$entry: URL entry, passed by reference
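
This page does not show how getLastModified reads the header or what it writes into $entry. The following is only a rough sketch of one way such a step could work, assuming the raw response header is available as a string. The function name fillLastModified and the array keys 'loc' and 'lastmod' are hypothetical and do not come from SMCCrawler.

```php
<?php
// Hypothetical helper, not part of SMCCrawler: extracts the Last-Modified
// value from a raw HTTP response header and stores it in $entry['lastmod'].
function fillLastModified(string $rawHeader, array &$entry): void
{
    // Match the header line case-insensitively, one line at a time.
    if (!preg_match('/^Last-Modified:[ \t]*(.+?)\s*$/mi', $rawHeader, $m)) {
        return; // header absent: leave $entry untouched
    }
    $timestamp = strtotime($m[1]);
    if ($timestamp === false) {
        return; // unparsable date: leave $entry untouched
    }
    // Sitemaps use W3C datetime; gmdate keeps the value in UTC.
    $entry['lastmod'] = gmdate('Y-m-d\TH:i:s\Z', $timestamp);
}

// Example call with a made-up response header.
$entry  = ['loc' => 'http://www.example.com/'];
$header = "HTTP/1.1 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "Last-Modified: Sun, 20 Jan 2013 19:18:51 GMT\r\n\r\n";
fillLastModified($header, $entry);
echo $entry['lastmod'], "\n"; // prints 2013-01-20T19:18:51Z
```

Because $entry is taken by reference, the caller's array is updated in place, which matches the void return type in the signature above.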
handleDocumentInfo (line 48)

Get access to all information about a page or file the crawler found and received.

  • section: 3 Crawler
  • access: public
void handleDocumentInfo (PHPCrawlerDocumentInfo $PageInfo)
  • PHPCrawlerDocumentInfo $PageInfo: A PHPCrawlerDocumentInfo-object containing all information about the currently received document.

Redefinition of:
PHPCrawler::handleDocumentInfo()
Override this method to get access to all information about a page or file the crawler found and received.
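
The override pattern described above can be sketched as follows. This is an illustration, not SMCCrawler's actual code: the subclass UrlCollectingCrawler and its $urls property are hypothetical, and the two stand-in classes exist only so the snippet runs without PHPCrawl installed. The sketch assumes the document-info object exposes url and http_status_code properties.

```php
<?php
// Stand-ins so the sketch runs without PHPCrawl; when the real library is
// already loaded, these two definitions are skipped.
if (!class_exists('PHPCrawler')) {
    class PHPCrawlerDocumentInfo
    {
        public $url;
        public $http_status_code;
    }
    class PHPCrawler
    {
        public function handleDocumentInfo(PHPCrawlerDocumentInfo $PageInfo) {}
    }
}

// Hypothetical subclass: keeps the URL of every document answered with 200.
class UrlCollectingCrawler extends PHPCrawler
{
    public $urls = array();

    public function handleDocumentInfo(PHPCrawlerDocumentInfo $PageInfo)
    {
        if ($PageInfo->http_status_code == 200) {
            $this->urls[] = $PageInfo->url;
        }
    }
}

// Simulate the crawler handing over two received documents.
$crawler = new UrlCollectingCrawler();
$documents = array(
    array('http://www.example.com/', 200),
    array('http://www.example.com/missing', 404),
);
foreach ($documents as $doc) {
    $info = new PHPCrawlerDocumentInfo();
    $info->url = $doc[0];
    $info->http_status_code = $doc[1];
    $crawler->handleDocumentInfo($info);
}
print_r($crawler->urls); // only the first URL is kept
```

In a real crawl the library calls handleDocumentInfo itself for each received document, so the manual loop at the end would be replaced by configuring the crawler (for example with setURL()) and calling go().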

Inherited Methods

Inherited From PHPCrawler

PHPCrawler::__construct()
PHPCrawler::addBasicAuthentication()
PHPCrawler::addContentTypeReceiveRule()
PHPCrawler::addFollowMatch()
PHPCrawler::addLinkExtractionTags()
PHPCrawler::addLinkPriority()
PHPCrawler::addLinkSearchContentType()
PHPCrawler::addNonFollowMatch()
PHPCrawler::addPostData()
PHPCrawler::addReceiveContentType()
PHPCrawler::addReceiveToMemoryMatch()
PHPCrawler::addReceiveToTmpFileMatch()
PHPCrawler::addStreamToFileContentType()
PHPCrawler::addURLFilterRule()
PHPCrawler::addURLFollowRule()
PHPCrawler::checkForAbort()
PHPCrawler::cleanup()
PHPCrawler::createWorkingDirectory()
PHPCrawler::disableExtendedLinkInfo()
PHPCrawler::enableAggressiveLinkSearch()
PHPCrawler::enableCookieHandling()
PHPCrawler::enableResumption()
PHPCrawler::getCrawlerId()
PHPCrawler::getProcessReport()
PHPCrawler::getReport()
PHPCrawler::go()
PHPCrawler::goMultiProcessed()
PHPCrawler::handleDocumentInfo()
PHPCrawler::handleHeaderInfo()
PHPCrawler::handlePageData()
PHPCrawler::initChildProcess()
PHPCrawler::initCrawlerProcess()
PHPCrawler::obeyNoFollowTags()
PHPCrawler::obeyRobotsTxt()
PHPCrawler::processRobotsTxt()
PHPCrawler::processUrl()
PHPCrawler::resume()
PHPCrawler::setAggressiveLinkExtraction()
PHPCrawler::setConnectionTimeout()
PHPCrawler::setContentSizeLimit()
PHPCrawler::setCookieHandling()
PHPCrawler::setFollowMode()
PHPCrawler::setFollowRedirects()
PHPCrawler::setFollowRedirectsTillContent()
PHPCrawler::setLinkExtractionTags()
PHPCrawler::setPageLimit()
PHPCrawler::setPort()
PHPCrawler::setProxy()
PHPCrawler::setStreamTimeout()
PHPCrawler::setTmpFile()
PHPCrawler::setTrafficLimit()
PHPCrawler::setURL()
PHPCrawler::setUrlCacheType()
PHPCrawler::setUserAgentString()
PHPCrawler::setWorkingDirectory()
PHPCrawler::starControllerProcessLoop()
PHPCrawler::startChildProcessLoop()

Documentation generated on Sun, 20 Jan 2013 21:18:51 +0200 by phpDocumentor 1.4.4