"Fossies" - the Fresh Open Source Software Archive

Member "checkbot-1.80/ChangeLog" (15 Oct 2008, 45564 Bytes) of package /linux/www/old/checkbot-1.80.tar.gz:


    1 2008-10-15  Hans de Graaff  <hans@degraaff.org>
    2 
    3 	* Checkbot 1.80 is released
    4 
    5 2008-07-08  Hans de Graaff  <hans@degraaff.org>
    6 
    7 	* checkbot (handle_doc): Tighten up the check for a robots tag so
    8 	that nofollow text later in the document won't be matched, thus
    9 	skipping the whole document, bug 2005950.
   10 
   11 2007-05-05  Brandon Bell  <Brandon_Bell@bcit.ca>
   12 
   13 	* checkbot: mms scheme can be ignored safely.
   14 
   15 2007-04-30  Hans de Graaff  <hans@degraaff.org>
   16 
   17 	* checkbot (printAllServers): Clarify that 'Unique links' actually
   18 	is 'Documents scanned'.
   19 
   20 2007-02-26  Hans de Graaff  <hans@degraaff.org>
   21 
   22 	* checkbot (handle_doc): Handle the case where decoded_content is
   23 	not available as per bug 1665075.
   24 
   25 2007-02-26  Gerald Pfeifer  <gerald@pfeifer.com>
   26 
   27 	* checkbot (check_point): Simplify and add a comment.
   28 
   29 2007-02-26  Hans de Graaff  <hans@degraaff.org>
   30 
   31 	* Makefile.PL: Require LWP 5.803 or better. decoded_content got
   32 	added in 5.802 and 5.803 added some important bugfixes.
   33 
   34 2007-02-03  Hans de Graaff  <hans@degraaff.org>
   35 
   36 	* Checkbot 1.79 is released
   37 
   38 	* RELEASE-PROCESS: Add the release process documentation.
   39 
   40 2007-01-27  Gerald Pfeifer  <gerald@pfeifer.com>
   41 
   42 	* checkbot (init_suppression): Check and provide error if
   43 	suppression file is in fact a directory.
   44 
   45 2006-12-28  Hans de Graaff  <hans@degraaff.org>
   46 
   47 	* checkbot: Add summary to tables to make files XHTML 1.1 compliant.
   48 
   49 2006-11-16  Hans de Graaff  <hans@degraaff.org>
   50 
   51 	* checkbot (handle_doc): Parse the decoded content so that all
   52 	character set issues are dealt with before parsing. This solves
   53 	bug 1264729.
   54 
   55 2006-11-14  Hans de Graaff  <hans@degraaff.org>
   56 
   57 	* checkbot (performRequest): Simplify the code dealing with
   58 	problems of HEAD requests by retrying all 500 responses instead of
   59 	special-casing particular failures that we happen to know
   60 	about. This type of problem is all too common, and if there really
   61 	is a problem GET will find it anyway.
   62 	(add_error): Allow regular expressions in the suppression
   63 	file. Based on patch from Eric Noack
   64 
   65 2006-11-14  Eric Noack  <en@lightwerk.com>
   66 
   67 	* checkbot (send_mail): Indicate how many errors are detected in
   68 	the notification email's subject.
   69 	(handle_doc): Use the URL with which the document was received for
   70 	the problem reports and internal accounting, but keep on using the
   71 	proper base URL as defined by the response object when retrieving
   72 	links from the document. This fixes the case where a weird BASE
   73 	URL in a document could make it unclear where the actual problem
   74 	was.
   75 
   76 2006-10-28  Hans de Graaff  <hans@degraaff.org>
   77 
   78 	* checkbot (performRequest): Handle case where an FTP server may
   79 	not be able to handle a HEAD request. This may cause a lot of data
   80 	to be transferred in those cases.
   81 
   82 2006-05-03  Hans de Graaff  <hans@degraaff.org>
   83 
   84 	* Checkbot 1.78 is released
   85 
   86 2005-12-18  Hans de Graaff  <hans@degraaff.org>
   87 
   88 	* checkbot (printServerProblems): Make pages XHTML compliant again.
   89 
   90 2005-12-18  Jens Schweikhardt  <schweikh@schweikhardt.net>
   91 
   92 	* checkbot: Add classes and ids so that more styling options for
   93 	CSS are available.
   94 	* checkbot2.css: Example CSS file using the new classes and ids.
   95 
   96 2005-11-11  Hans de Graaff  <hans@degraaff.org>
   97 
   98 	* checkbot: React in a more subtle way if the Time::Duration
   99 	module is not found.
  100 
  101 2005-09-22  Hans de Graaff  <hans@degraaff.org>
  102 
  103 	* Makefile.PL: Check for presence of Net::SSL and explain the
  104 	effects if this is not present.
  105 
  106 2005-08-20  Hans de Graaff  <hans@degraaff.org>
  107 
  108 	* checkbot (handle_doc): Ignore some 'links' found by LinkExtor
  109 	which do not need to link to live links. Fixed bugs #1264447 and
  110 	#1107832. 
  111 
  112 	* test.html: Add test cases for it.
  113 
  114 2005-08-06  Hans de Graaff  <hans@degraaff.org>
  115 
  116 	* checkbot (performRequest): Switch from HEAD to GET on a 400
  117 	error, as the most likely cause is that the server has trouble
  118 	with HEAD requests.
  119 
  120 2005-08-05  Hans de Graaff  <hans@degraaff.org>
  121 
  122 	* checkbot (handle_doc): Also show how many new links are found on
  123 	a page, not just the total number of links.
  124 	(performRequest): Don't retry GET method on a 403 error.
  125 	(handle_doc): Properly handle newlines in the matches for title
  126 	and robots meta tag.
  127 
  128 2005-07-28  Hans de Graaff  <hans@degraaff.org>
  129 
  130 	* Checkbot 1.77 is released.
  131 
  132 	* checkbot: Fix use of $VERSION so that it compiles and can be
  133 	used by MakeMaker at the same time.
  134 	(handle_doc): Check for presence of robots meta tag and act on it.
  135 	Based on a patch by Donald Willingham.
  136 
  137 2005-07-25  Hans de Graaff  <hans@degraaff.org>
  138 
  139 	* Checkbot 1.76 is released.
  140 
  141 2005-06-07  Hans de Graaff  <hans@degraaff.org>
  142 
  143 	* checkbot (printServerProblems): Include title of page.
  144 	(handle_doc): Extract title for later printing.
  145 	Add new hash url_title to store page titles.
  146 	Based on a patch from John Bintz.
  147 
  148 2005-04-23  Hans de Graaff  <hans@degraaff.org>
  149 
  150 	* checkbot: Add documentation on use of file:/// URLs.
  151 
  152 2005-01-23  Hans de Graaff  <hans@degraaff.org>
  153 
  154 	* checkbot: Only send mail when Checkbot has detected any
  155 	problems, based on suggestion from Thomas Kuerten.
  156 
  157 	Print duration of run on final report, and refactor use of start
  158 	time variable to facilitate this. Feature depends on availability
  159 	of Time::Duration, but checkbot will work without it. Based on
  160 	patch from Adam Griff.
  161 
  162 2005-01-23  Adam Griff <griff@computer.org>
  163 	
  164 	* checkbot (create_page): Print out more options on results page.
  165 	
  166 2005-01-21  Hans de Graaff  <hans@degraaff.org>
  167 
  168 	* checkbot: Remove automatic version number based on CVS version
  169 	now that commits will be more frequent than releases.
  170 
  171 2004-11-12  Hans de Graaff  <hans@degraaff.org>
  172 
  173 	* checkbot (handle_url): Ignore javascript: URLs instead of
  174 	generating a 904 error. It would be nice to handle these as well.
  175 
  176 2004-05-26  Hans de Graaff  <hans@degraaff.org>
  177 
  178 	* Makefile.PL: Sync HTML::Parser requirement with required
  179 	versions of libwww-perl.
  180 
  181 2004-05-03  Hans de Graaff  <hans@degraaff.org>
  182 
  183 	* checkbot: Write better documentation for --file option.
  184 
  185 2004-04-26  Hans de Graaff  <hans@degraaff.org>
  186 
  187 	* checkbot: Minor documentation changes thanks to Jens
  188 	Schweikhardt.
  189 
  190 2004-04-22  Hans de Graaff  <hans@degraaff.org>
  191 
  192 	* Checkbot 1.75 is released.
  193 
  194 2004-04-21  Hans de Graaff  <hans@degraaff.org>
  195 
  196 	* checkbot (print_help): Use a here-doc for the help for easier
  197 	maintenance.
  198 	(init_modules): Add --noproxy options to set list of domains which
  199 	will not be passed through the proxy.
  200 
  201 2004-04-18  Hans de Graaff  <hans@degraaff.org>
  202 
  203 	* checkbot (handle_url): Create an error if an unknown scheme is
  204 	encountered and only ignore known schemes like mailto:
  205 
  206 2004-03-30  Hans de Graaff  <hans@degraaff.org>
  207 
  208 	* checkbot: Add explanation about error message which indicates
  209 	lack of SSL support.
  210 
  211 2004-03-28  Hans de Graaff  <hans@degraaff.org>
  212 
  213 	* checkbot: Add EXAMPLES section to the perldoc documentation with
  214 	an example of the most simple invocation. Needs more examples...
  215 	Update help text for --mailto to confirm that more than one
  216 	address is possible.
  217 
  218 	* checkbot: Add new --cookies option to accept cookies from
  219 	servers. Based on patch from Roger Pilkey.
  220 
  221 2004-02-09  Hans de Graaff  <hans@degraaff.org>
  222 
  223 	* Makefile.PL: Show correct text if LWP test fails.
  224 
  225 2004-01-05  Hans de Graaff  <hans@degraaff.org>
  226 
  227 	* Makefile.PL: Now require LWP 5.76 to avoid problems with 500
  228 	"Need a field name" HTTP errors being generated by LWP.
  229 
  230 2003-12-29  Gerald Pfeifer  <gerald@pfeifer.com>
  231 
  232 	* checkbot: Improve description of --proxy.
  233 	(print_help): Ditto.
  234 
  235 2003-12-21  Hans de Graaff  <hans@degraaff.org>
  236 
  237 	* checkbot (performRequest): $url->authority may not be defined
  238 	for the URL we are checking.
  239 
  240 2003-12-17  Hans de Graaff  <hans@degraaff.org>
  241 
  242 	* Checkbot 1.74 is released
  243 
  244 	* checkbot (add_error): Take into account that status message can
  245 	be undefined.
  246 
  247 2003-12-15  Hans de Graaff  <hans@degraaff.org>
  248 
  249 	* checkbot: Put Checkbot errors in a hash to have one set of
  250 	descriptions around.
  251 	(handle_doc): Use it.
  252 	(checkbot_status_message): Use it to find the status message for a
  253 	code from HTTP codes, Checkbot codes, or a generic status message.
  254 	(printServerProblems): Use it.
  255 	(handle_url): Move checks for --dontwarn and --suppression
  256 	features from here ...
  257 	(add_error): ... to here so that it applies to all errors.
  258 
  259 2003-12-14  Hans de Graaff  <hans@degraaff.org>
  260 
  261 	* checkbot: Document that Checkbot defines its own response codes
  262 	for common problems.
  263 	No longer a need for the %warning hash.
  264 	(add_error): New function to add a new error into the hashes.
  265 	(handle_url): Use it.
  266 	(handle_doc): Use it for what previously were warnings.
  267 	(printServerWarnings): Obsolete as warnings have been changed to
  268 	use the normal error handling routines.
  269 	Marked --allow-simple-hosts option as deprecated, because this can
  270 	now be handled in a more generic way by the --dontwarn mechanism.
  271 	(print_help): Removed --allow-simple-hosts option from help.
  272 	(add_to_queue): Move code to check for double slash in URL to ...
  273 	(handle_doc): ... here as Checkbot error 903.
  274 
  275 2003-11-29  Hans de Graaff  <hans@degraaff.org>
  276 
  277 	* checkbot (printServerProblems): Oops. Make sure all output is
  278 	going to the right file, not stdout.
  279 	Add new --suppress option which reads a file with response code /
  280 	URL combinations to be suppressed in the output, based on patch by
  281 	Rob Chekaluk.
  282 	(init_suppression): Read suppression file and fill hash with
  283 	results.
  284 	(handle_url): Use it.
  285 	(print_help): Document it.
  286 
  287 2003-11-24  Hans de Graaff  <hans@degraaff.org>
  288 
  289 	* checkbot: Add example to --ignore argument.
  290 
  291 2003-11-23  Hans de Graaff  <hans@degraaff.org>
  292 
  293 	* checkbot (init_modules): Delete commented-out code to enable
  294 	HTTP 1.1 in LWP. HTTP 1.1 has been the default in LWP for a while
  295 	and does not need special code to be enabled.
  296 
  297 2003-11-21  Hans de Graaff  <hans@degraaff.org>
  298 
  299 	* checkbot (printServerProblems): Don't assume that status_message
  300 	is defined for all possible codes, based on patch by Thomas
  301 	Kuerten.
  302 
  303 2003-10-18  Hans de Graaff  <hans@degraaff.org>
  304 
  305 	* Makefile.PL: Require LWP 5.70 because problems with HEAD of
  306 	ftp:// links have been solved in this release.
  307 
  308 2003-09-05  Hans de Graaff  <hans@degraaff.org>
  309 
  310 	* checkbot (printServerProblems): Put line breaks in HTML file in
  311 	a more logical place.
  312 
  313 2003-08-31  Hans de Graaff  <hans@degraaff.org>
  314 
  315 	* Checkbot 1.73 released
  316 
  317 2003-08-30  Hans de Graaff  <hans@degraaff.org>
  318 
  319 	* checkbot (printServerProblems): Protect against undefined status.
  320 
  321 2003-08-29  Hans de Graaff  <hans@degraaff.org>
  322 
  323 	* checkbot (handle_doc): Ignore URIs matching --ignore as they are
  324 	being found.
  325 	(handle_url): Remove check for --ignore option here.
  326 	Update documentation for --ignore.
  327 	(print_help): Idem.
  328 
  329 2003-08-21  Hans de Graaff  <hans@degraaff.org>
  330 
  331 	* checkbot: Made --interval description a bit more clear.
  332 
  333 2003-07-26  Hans de Graaff  <hans@degraaff.org>
  334 
  335 	* checkbot (init_modules): Uncomment proxy support, but it now
  336 	applies to all requests, not just external ones.
  337 	(print_help): Update --proxy help text.
  338 	Update perldoc documentation.
  339 
  340 2003-07-05  Hans de Graaff  <hans@degraaff.org>
  341 
  342 	* checkbot: Additional explanation for --exclude option.
  343 
  344 2003-06-28  Bernd Petrovitsch  <bernd@firmix.at>
  345 
  346 	* checkbot.css: Additional cleaning up of the CSS file.
  347 
  348 2003-06-26  Bernd Petrovitsch  <bernd@firmix.at>
  349 
  350 	* checkbot: Produce valid XHTML 1.1 pages.
  351 
  352 	* checkbot.css: Clean up of the CSS file.
  353 
  354 2003-05-04  Hans de Graaff  <hans@degraaff.org>
  355 
  356 	* Checkbot 1.72 released
  357 	
  358 	* checkbot: Applied spelling fixes from Jens Schweikhardt.
  359 	(clean_up): Factored out of check_links so that it can also be
  360 	called when we catch a signal.
  361 	(got_signal): Catch signals like SIGINT and handle them, based on
  362 	patch by Jens Schweikhardt.
  363 
  364 2003-04-06  Hans de Graaff  <hans@degraaff.org>
  365 
  366 	* checkbot (handle_url): No longer ignore URLs with a query
  367 	string. If checking these is not wanted then the --exclude option
  368 	can be used, and an example for this is now included in the
  369 	documentation.
  370 
  371 2003-03-30  Hans de Graaff  <hans@degraaff.org>
  372 
  373 	* checkbot (printServerProblems): Add links to different error
  374 	codes on a server page for quick navigation.
  375 
  376 2003-02-22  Paul Merchant, Jr.  <Paul.L.Merchant.Jr@Dartmouth.EDU>
  377 
  378 	* checkbot: Initialize the statistics counters to avoid warnings.
  379 
  380 2003-01-15  Hans de Graaff  <hans@degraaff.org>
  381 
  382 	* checkbot (output): Correct the check for --verbose; not
  383 	specifying it now generates no output.
  384 
  385 2003-01-06  Hans de Graaff  <hans@degraaff.org>
  386 
  387 	* checkbot (handle_doc): The host name check does not make much
  388 	sense for news: scheme URLs.
  389 
  390 2003-01-03  Hans de Graaff  <hans@degraaff.org>
  391 
  392 	* checkbot (init_globals): Only remove file from default --match
  393 	argument when there is a path component in the start URL.
  394 	Initialize problem counter to avoid warning about uninitialized
  395 	value.
  396 
  397 2002-12-29  Hans de Graaff  <hans@degraaff.org>
  398 
  399 	* Checkbot 1.71 released
  400 	
  401 	* checkbot (handle_url): Make sure we feed is_internal a string.
  402 	(handle_url): Use existing variable instead of Referer header to
  403 	store parent URL.
  404 
  405 	* Checkbot 1.70 created for testing, but not released
  406 
  407 	* checkbot (performRequest): Add HTTP 403 error to list of error
  408 	codes to retry with a GET.
  409 	(handle_url): Only follow redirections for internal links.
  410 
  411 2002-12-28  Hans de Graaff  <hans@degraaff.org>
  412 
  413 	* checkbot: Removed reference to AnyDBM_File because it is not
  414 	used anywhere.
  415 	Rewrote global statistics gathering to be more simple and more
  416 	accurate.
  417 	Added --filter option which allows rewriting of URLs before they
  418 	are checked, based on patch from Eli the Bearded <eli@netusa.net>.
  419 	Simplified storage of URLs with problems
  420 	(get_headers): Removed.
  421 	(performRequest): Included code from get_headers here.
  422 	(count_problems): Updated for new storage of URLs
  423 	(printServerProblems): Idem.
  424 	(handle_url): Idem.
  425 	(handle_doc): Idem.
  426 	(count_problems): Idem.
  427 	(printServerProblems): Idem.
  428 	(handle_doc): Add code to report all pages on which a problematic
  429 	URL appears.
  430 	(init_globals): Changed default --match argument to exclude final
  431 	page name.
  432 	
  433 
  434 2002-12-27  Hans de Graaff  <hans@degraaff.org>
  435 
  436 	* checkbot (output): Moved printing, including indentation and
  437 	verbose checking, to function 'output'.
  438 	(handle_doc): No more distinction between internal and external
  439 	links, we throw all links found in the queue.
  440 	(handle_doc): Removed statistics for now, they are too buggy.
  441 	(is_checked): New function takes into account that we sometimes
  442 	translate hostnames to IP addresses.
  443 	(handle_doc): Use it.
  444 	(check_internal): Removed dependency on statistics, use actual
  445 	queue contents to determine when all links are checked.
  446 	(handle_url): Only query server for file type on
  447 	application/octet-stream documents.
  448 	(is_internal): New function to determine if URL is internal.
  449 	(handle_url): Rewritten to use new functions and to deal with
  450 	external URLs being mixed in, and generally cleaned up.
  451 	(handle_url): Moved --internal-only checks here.
  452 	(check_external): Removed.
  453 	(check_links): Renamed from check_internal.
  454 	Added small blurb to documentation on distinction between internal
  455 	and external links and the way checkbot checks these.
  456 
  457 	* t/test.t: Added simple test case: can checkbot be run without
  458 	arguments?
  459 
  460 2002-12-25  Hans de Graaff  <hans@degraaff.org>
  461 
  462 	* Checkbot 1.69 released
  463 
  464 2002-12-25  Hans de Graaff  <hans@degraaff.org>
  465 
  466 	* checkbot (get_headers): Make sure feedback on HEAD requests gets
  467 	indented properly.
  468 
  469 2002-12-23  Hans de Graaff  <hans@degraaff.org>
  470 
  471 	* checkbot (init_globals): Anchor automatic match argument based
  472 	on start URLs at the beginning.
  473 
  474 2002-12-16  Jens Schweikhardt  <schweikh@schweikhardt.net>
  475 
  476 	* checkbot (check_external): Fixed printf to be print so that
  477 	actual information can be printed using --verbose.
  478 
  479 2002-12-02  Hans de Graaff  <hans@degraaff.org>
  480 
  481 	* checkbot (get_headers): Also add 406 as an error which might
  482 	indicate that the web server doesn't like us doing a HEAD, so GET
  483 	instead.
  484 
  485 2002-12-01  Hans de Graaff  <hans@degraaff.org>
  486 
  487 	* Makefile.PL: Updated based on libwww-perl Makefile.PL.
  488 
  489 	* checkbot: Remove the preamble cruft and just assume perl will be
  490 	/usr/bin/perl. Therefore also renamed checkbot.pl -> checkbot.
  491 	Indicate that Checkbot is licensed under the same terms as Perl
  492 	itself.
  493 
  494 	* checkbot.pl (count_problems): Rewrote debugging code to handle
  495 	request without header() method, even though this should not be
  496 	possible it does happen in the wild. 
  497 	(handle_doc): Perform fully-qualified hostname check for all URIs
  498 	which support a hostname.
  499 
  500 2002-11-30  Hans de Graaff  <hans@degraaff.org>
  501 
  502 	* checkbot.pl (add_checked): Use ->can construct to check if URL
  503 	supports host method.
  504 
  505 2002-10-27  Hans de Graaff  <hans@degraaff.org>
  506 
  507 	* checkbot.pl: Add hints for recursive or run-away checkbot
  508 	processes.
  509 
  510 2002-09-28  Hans de Graaff  <hans@degraaff.org>
  511 
  512 	* Checkbot 1.68 released
  513 
  514 2002-08-05  Hans de Graaff  <hans@degraaff.org>
  515 
  516 	* checkbot.pl (handle_doc): Comment out warning about external
  517 	URLs with non-checkable schemes to avoid lots of useless output.
  518 
  519 2002-06-09  Jostle Lemcke  <jostle@users.sourceforge.net>
  520 
  521 	* checkbot.pl: Added --allow-simple-hosts option. This option
  522 	turns off the warnings for unqualified host names.
  523 
  524 2002-04-01  Hans de Graaff  <hans@degraaff.org>
  525 
  526 	* checkbot.pl (handle_doc): Ignore URLs found in <base>
  527 	tags. Suggestion from Roman Maeder.
  528 
  529 2002-03-31  Hans de Graaff  <hans@degraaff.org>
  530 
  531 	* checkbot.pl (print_help): Mention --style option in help message.
  532 	(check_internal): Always close CURRENT filehandle, and add warn
  533 	for potential problems with this based on patch and report from
  534 	Greg Larkin.
  535 
  536 	* checkbot.pl: Added HINTS AND TIPS section to
  537 	documentation. Added hint on using passive FTP based on feedback
  538 	from Roman Maeder.
  539 
  540 2002-03-31  Brent Verner  <brent@rcfile.org>
  541 
  542 	* checkbot.pl (handle_doc): Only match http and https, not stuff
  543 	like httpa.
  544 
  545 2002-03-31  Paco Hope  <paco@paco.to>
  546 
  547 	* checkbot.css: Contributed style sheet for Checkbot. Use with
  548 	--style option.
  549 
  550 2002-01-20  Roman Maeder  <maeder@mathconsult.ch>
  551 
  552 	* checkbot.pl (handle_url): Use select() to sleep instead of
  553 	sleep() so that sleep interval can be fractional.
  554 
  555 2001-12-16  Hans de Graaff  <hans@degraaff.org>
  556 
  557 	* Checkbot 1.67 released
  558 
  559 2001-11-16  Hans de Graaff  <hans@degraaff.org>
  560 
  561 	* checkbot.pl: Add example for --match argument based on question
  562 	by Michael Lambert.
  563 
  564 2001-11-11  Hans de Graaff  <hans@degraaff.org>
  565 
  566 	* checkbot.pl (count_problems): Quote meta characters in server
  567 	name and URL when matching them.
  568 	(handle_doc): Fix two minor bugs related to the move to URI.
  569 
  570 2001-11-11  Evaldas Imbrasas  <evaldas@wolfram.com>
  571 
  572 	* checkbot.pl: Add --language option to allow language
  573 	negotiation.
  574 
  575 	* checkbot.pl (check_options): Set default for --sleep option to 0.
  576 
  577 	* checkbot.pl (check_internal): Only close <CURRENT> if it already
  578 	exists.
  579 	
  580 2001-11-03  Hans de Graaff  <hans@degraaff.org>
  581 
  582 	* checkbot.pl (printServerProblems): There might not be a response
  583 	message.
  584 	(handle_url): Use status_line instead of code and message for
  585 	HTTP::Response object.
  586 	(handle_doc): Also check external gopher links.
  587 
  588 2001-10-25  Hans de Graaff  <hans@degraaff.org>
  589 
  590 	* Checkbot 1.66 released
  591 
  592 	* checkbot.pl (get_headers): URI doesn't know about netloc, but it
  593 	does know about authority.
  594 	(get_headers): $url is already absolute, no need for ->abs
  595 
  596 2001-10-18  Hans de Graaff  <hans@degraaff.org>
  597 
  598 	* Checkbot 1.65 released
  599 
  600 2001-10-14  Hans de Graaff  <hans@degraaff.org>
  601 
  602 	* checkbot.pl (handle_doc): Print a notice when external non
  603 	HTTP/FTP URLs are dropped.
  604 
  605 2001-09-29  Hans de Graaff  <hans@degraaff.org>
  606 
  607 	* checkbot.pl (init_modules and other places): Remove
  608 	URI::URL::strict call and use of new URI::URL because it is
  609 	obsolete, we should use the URI classes now.
  610 
  611 2001-09-23  Hans de Graaff  <hans@degraaff.org>
  612 
  613 	* checkbot.pl (init_globals): Initialize last checkpoint time with
  614 	0 instead of current time, so that we write out a set of pages
  615 	right at the start. This will catch problems with permissions for
  616 	these pages as early as possible.
  617 
  618 2001-07-01  Hans de Graaff  <hans@degraaff.org>
  619 
  620 	* checkbot.pl (get_server_type): Take into account that we might
  621 	not learn anything about the server
  622 
  623 2001-05-06  Hans de Graaff  <hans@degraaff.org>
  624 
  625 	* checkbot.pl (get_headers): Factored out of check_external so
  626 	that moving to using GET requests only will be easier later.
  627 
  628 2001-04-30  Hans de Graaff  <hans@degraaff.org>
  629 
  630 	* checkbot.pl (send_mail): Really fix printing of starting URLs in
  631 	email. All URLs are now printed in the subject and body of the
  632 	message.
  633 
  634 2001-04-15  Hans de Graaff  <hans@degraaff.org>
  635 
  636 	* Checkbot 1.64 released
  637 
  638 2001-03-13  Hans de Graaff  <hans@degraaff.org>
  639 
  640 	* checkbot.pl (send_mail): Fix printing of starting URL in email.
  641 
  642 2001-03-04  Nick Hibma <n_hibma@qubesoft.com>
  643 
  644 	* checkbot.pl (printServerWarnings): Removed duplicate print statement.
  645 
  646 2001-02-10  Boris Lantrewitz <lantrewi@do.isst.fhg.de>
  647 
  648 	* checkbot.pl (init_globals): Allow more environment variables to
  649 	be used to set the temporary directory.
  650 	(send_mail): Avoid using printf to the handle for those systems
  651 	where printf on a pipe is not implemented.
  652 
  653 2001-01-14  Hans de Graaff  <hans@degraaff.org>
  654 
  655 	* Checkbot 1.63 released
  656 
  657 2001-01-02  Hans de Graaff  <hans@degraaff.org>
  658 
  659 	* Makefile.PL (chk_version): Require LWP 5.50, which contains an
  660 	important bugfix when dealing with relative redirects.
  661 
  662 2001-01-01  Hans de Graaff  <hans@degraaff.org>
  663 
  664 	* checkbot.pl (init_globals): If no --match is given, construct
  665 	one based on all the start URLs given. Suggested by Mathieu
  666 	Guillaume.
  667 
  668 2000-12-31  Hans de Graaff  <hans@degraaff.org>
  669 
  670 	* checkbot.pl (create_page): Remove the .bak file when the new
  671 	file is written, unless --debug is in effect.
  672 
  673 2000-12-31  OBARA Kiyotake <obara@vc-net.ne.jp>
  674 
  675 	* checkbot.pl (print_server): Create correct URLs when --file
  676 	argument contains directories as well as a filename.
  677 
  678 2000-12-31  David Brownlee <abs@purplei.com>
  679 
  680 	* checkbot.pl (create_page): Fix typo in die message.
  681 
  682 2000-12-24  Hans de Graaff  <hans@degraaff.org>
  683 
  684 	* checkbot.pl: Added a small blurb in the documentation about the
  685 	URLs Checkbot will find and check.
  686 
  687 2000-12-24  Petter Reinholdtsen <pere@hungry.com>
  688 
  689 	* checkbot.pl (handle_url): Deal with redirect responses without
  690 	Location header.
  691 
  692 2000-11-18  Roman Maeder  <maeder@mathconsult.ch>
  693 
  694 	* checkbot.pl (handle_url): Remove check which would not check
  695 	files named the same as the main report file. If you don't want
  696 	Checkbot to check its intermediate pages, use the --exclude
  697 	option.
  698 
  699 	* checkbot.pl (handle_url): Ask server for file type when
  700 	requesting http:// URLs to be on the safe side, as using
  701 	guess_media_type() is not always correct.
  702 
  703 2000-10-28  Nick Hibma  <n_hibma@qubesoft.com>
  704 
  705 	* checkbot.pl (check_external): Only print when --verbose is true.
  706 	(printServerProblems): Fix proper printing of <hr>.
  707 	(handle_doc): Include proper URL for report for unqualified URLs.
  708 
  709 2000-10-01  TAKAKU Masao  <masao@ulis.ac.jp>
  710 
  711 	* checkbot.pl (print_server): Make pages well-formed by inserting
  712 	<html> and <body> tags.
  713 
  714 2000-09-24  Hans de Graaff  <hans@degraaff.org>
  715 
  716 	* Checkbot 1.62 released
  717 
  718 2000-09-16  Hans de Graaff  <hans@degraaff.org>
  719 
  720 	* checkbot.pl (send_mail): Only mention URL in the subject of the
  721 	mail if one is given through the --url option.
  722 	(check_external): The ALEPH web server is also broken with respect
  723 	to HEAD requests.
  724 
  725 2000-09-04  Hans de Graaff  <hans@degraaff.org>
  726 
  727 	* checkbot.pl (check_external): JavaWebServer is also broken with
  728 	respect to HEAD requests.
  729 
  730 2000-08-26  Hans de Graaff  <hans@degraaff.org>
  731 
  732 	* checkbot.pl (create_page): Add --style option which allows a
  733 	link to a CSS file to be included in each Checkbot page.
  734 
  735 2000-08-20  Nick Hibma  <n_hibma@qubesoft.com>
  736 
  737 	* checkbot.pl (check_external): Some servers don't set the Server:
  738 	header. Check to see if the server field is set in a response to
  739 	avoid warnings.
  740 
  741 	* checkbot.pl (add_checked): Add --enable-virtual option to use
  742 	hostname instead of IP address to distinguish servers. This allows
  743 	checking of multiple virtual servers.
  744 
  745 2000-08-13  Hans de Graaff  <hans@degraaff.org>
  746 
  747 	* Makefile.PL: Add a check for HTML::Parser. Require latest
  748 	version, 3.10, because I'm not sure older versions work correctly.
  749 
  750 2000-06-29  Hans de Graaff  <hans@degraaff.org>
  751 
  752 	* Checkbot 1.61 released
  753 
  754 	* Makefile.PL (chk_version): Add version checked for in output.
  755 
  756 2000-06-18  Larry Gilbert <larry@n2h2.com>
  757 
  758 	* checkbot.pl (check_external): Use GET instead of HEAD for
  759 	confused closed-source servers.
  760 
  761 2000-06-18  Hans de Graaff  <hans@degraaff.org>
  762 
  763 	* Makefile.PL (chk_version): require URI 1.07 as it contains bug
  764 	fixes for using Base URLs.
  765 
  766 	* checkbot.pl: Change email and web address
  767 
  768 2000-04-30  Hans de Graaff <graaff@xs4all.nl>
  769 
  770 	* Checkbot 1.60 released
  771 
  772 	* checkbot.pl (check_options): Add option --dontwarn to exclude
  773 	certain types of warnings. Based on idea by David Hoekman.
  774 
  775 2000-04-29  Mark Roedel <roedelm@letu.edu>
  776 
  777 	* checkbot.pl (handle_url): Deal with "300 Multiple Choices"
  778 	response which does not offer a URL to redirect to.
  779 
  780 2000-04-09  David Hoekman <dhoekman@halcyon.com>
  781 
  782 	* checkbot.pl (init_globals): Allow for TMPDIR with or without
  783 	trailing /
  784 
  785 2000-04-08  Hans de Graaff  <graaff@xs4all.nl>
  786 
  787 	* checkbot.pl: Updated contact information in file header.
  788 
  789 2000-03-26  Hans de Graaff  <graaff@xs4all.nl>
  790 
  791 	* checkbot.pl (check_options): Add message about skipping of
  792 	external links. Also removes warning about single use of variable.
  793 
  794 2000-03-06  Brian McNett <webmaster@mycoinfo.com>
  795 
  796 	* checkbot.pl: On a Mac, ask command line options
  797 	through MacPerl mechanism.
  798 
  799 2000-02-06  Hans de Graaff  <graaff@xs4all.nl>
  800 
  801 	* checkbot.pl (init_globals): Check whether URLs on the command
  802 	line have a proper host. Thanks to Charles Williams for the
  803 	report.
  804 
  805 2000-01-30  Hans de Graaff  <graaff@xs4all.nl>
  806 
  807 	* Checkbot 1.59 released
  808 
  809 	* checkbot.pl (handle_doc): Use eof instead of parse(undef) to end
  810 	parsing.
  811 
  812 2000-01-15  Hans de Graaff  <graaff@xs4all.nl>
  813 
  814 	* checkbot.pl (handle_doc): Show warnings about hostnames only on
  815 	the console when --verbose.
  816 
  817 2000-01-09  Hans de Graaff  <graaff@xs4all.nl>
  818 
  819 	* checkbot.pl: Added option --internal-only to skip checking of
  820 	external links altogether. Idea by David Hoekman
  821 	<dhoekman@halcyon.com>
  822 
  823 2000-01-02  Hans de Graaff  <graaff@xs4all.nl>
  824 
  825 	* checkbot.pl (handle_doc): Use canonical URI from LinkExtor,
  826 	which simplifies the rest of the logic and gets things working
  827 	with the new version of LinkExtor.
  828 
  829 2000-01-01  Stephane Bortzmeyer <bortzmeyer@pasteur.fr>
  830 
  831 	* checkbot.pl (init_globals): Create Checkbot workdir in $TMPDIR
  832 	if defined, /tmp otherwise.
  833 
  834 1999-12-31  Hans de Graaff  <graaff@xs4all.nl>
  835 
  836 	* checkbot.pl (handle_doc): Change frag to fragment.
  837 
  838 1999-11-07  Hans de Graaff  <graaff@xs4all.nl>
  839 
  840 	* checkbot.pl (handle_doc): Add warning for URLs for which LWP
  841 	can't determine a hostname, and don't check them further.
  842 
  843 1999-10-24  Hans de Graaff  <graaff@xs4all.nl>
  844 
  845 	* checkbot.pl (print_help): Added line on --interval option.
  846 
  847 1999-10-23  Hans de Graaff  <graaff@xs4all.nl>
  848 
  849 	* checkbot.pl (init_globals): Fixed proper determination of server
  850 	prefix if a filename is supplied, thanks to Michael Baumer.
  851 
  852 1999-10-02  Hans de Graaff  <graaff@xs4all.nl>
  853 
  854 	* checkbot.pl (init_modules): Added use URI.
  855 
  856 1999-08-21  Hans de Graaff  <graaff@xs4all.nl>
  857 
  858 	* Makefile.PL (chk_version): Added check for URI.
  859 
  860 1999-07-17  Hans de Graaff  <graaff@xs4all.nl>
  861 
  862 	* README: Added blurb on the announcements mailing list.
  863 
  864 1999-07-06  Hans de Graaff  <graaff@xs4all.nl>
  865 
  866 	* checkbot.pl (add_checked): Deal with the fact that a mailto: URL 
  867 	has no host component. Thanks to John Croft for the report.
  868 
  869 1999-06-27  Hans de Graaff  <graaff@xs4all.nl>
  870 
  871 	* checkbot.pl (handle_url): Really fix relative redirection URLs
   872 	using the URI class. Thanks to Thomas Zander for the report and
  873 	reproducible failing URL.
  874 
  875 1999-05-03  Hans de Graaff  <graaff@xs4all.nl>
  876 
  877 	* checkbot.pl (printServerWarnings): Also change clustering of URLs.
  878 
  879 1999-05-02  Hans de Graaff <graaff@xs4all.nl>
  880 
  881 	* checkbot.pl (signature): Add quotes around the URL in the
  882 	signature.
  883 	(printServerProblems): Fixed clustering of URLs so that faulty
  884 	links are listed under the URL that contains them, instead of the
  885 	other way around. This ordering problem was introduced in 1.53.
  886 
  887 1999-04-10  Hans de Graaff  <graaff@xs4all.nl>
  888 
  889 	* checkbot.pl (handle_url): Make sure a redirected URL is fully
  890 	qualified (based on the original URL) to avoid dying on it
  891 	later. Thanks to David Hoekman for the initial analysis.
  892 
  893 1999-04-05  Hans de Graaff <graaff@xs4all.nl>
  894 
  895 	* checkbot.pl (printAllServers): Taken out of create_page for
  896 	clarity.
  897 	(printServerWarnings): Keep warning headers from being printed for 
  898 	each warning.
  899 
  900 1999-03-15  Hans de Graaff  <graaff@xs4all.nl>
  901 
  902 	* README: Explain which Perl modules are needed.
  903 
  904 1999-02-20  Hans de Graaff  <graaff@xs4all.nl>
  905 
  906 	* checkbot.pl (printServerWarnings): Fix printing of warnings so
  907 	that headers are only printed once.
  908 	(print_server): get correct IP address for web servers with
  909 	non-standard port numbers.
  910 
  911 1999-02-08  Hans de Graaff  <graaff@xs4all.nl>
  912 
  913 	* Makefile.PL (chk_version): Added location of Mail::Send.
  914 
  915 1999-01-18  Hans de Graaff  <graaff@xs4all.nl>
  916 
  917 	* checkbot.pl (count_problems): Change counting of problems to
  918 	deal with new structure.
  919 
  920 1999-01-17  Hans de Graaff  <graaff@xs4all.nl>
  921 
   922 	* checkbot.pl (printServerProblems): Changed to accommodate new
   923  	inventory of problem response. This new method allows multiple bad
   924  	links to one URL to be reported all at once. Also use
  925  	standardized response descriptions based on a patch by Benjamin
  926  	Franz <snowhare@nihongo.org>.
  927 
  928 1999-01-10  Hans de Graaff  <graaff@xs4all.nl>
  929 
  930 	* checkbot.pl (byReferringPage): Added to allow sorting of
  931 	problems by referer.
  932 	(byProblem): Removed code to compare by exact message and
  933 	referer.
  934 	Removed the pre-amble to generate correct perl path because it is
  935 	a bit too cumbersome during development.
  936 
  937 1998-12-31  Hans de Graaff  <graaff@xs4all.nl>
  938 
  939 	* checkbot.pl (handle_url): Do a HEAD request when the guessed
  940  	content-type matches application/octet-stream to get the real
  941  	content-type from the server.
  942 
  943 1998-12-27  Hans de Graaff <graaff@xs4all.nl>
  944 
  945 	* checkbot.pl (handle_doc): Added warning for HTTP URLs without a
  946 	fully-qualified hostname.
  947 
  948 	* checkbot.pl (printServerWarnings): Added a mechanism to also
  949  	display checkbot warnings, unrelated to the HTTP responses, on the
  950  	results pages.
  951 
  952 1998-10-24  Hans de Graaff <graaff@xs4all.nl>
  953 
  954 	* checkbot.pl (setup): Explicitly set record separator $/
  955 	This appears needed for perl 5.005, and fixes a problem
  956 	where no URLs would appear to match except the first few.
  957 
  958 1998-10-10  Hans de Graaff  <graaff@xs4all.nl>
  959 
  960 	* checkbot.pl: Made POD conform to new scripts format better.
  961 
  962 1998-06-21  Hans de Graaff  <graaff@xs4all.nl>
  963 
  964 	* checkbot.pl (init_modules): HTML::Parse is no longer needed,
  965 	removed.
  966 
  967 Sat Sep  6 16:00:12 1997  Hans de Graaff  <graaff@xs4all.nl>
  968 
  969 	* checkbot 1.51 released
  970 
  971 Sat Aug 30 18:05:39 1997  Hans de Graaff  <graaff@xs4all.nl>
  972 
  973 	* checkbot.pl (init_globals): assume file: scheme when no scheme
  974 	is present.
  975 
  976 	* checkbot.pl: Small portability stuff for perl 5.004 and LWP 5.11.
  977 
  978 Sun Aug 17 08:56:38 1997  Hans de Graaff  <graaff@xs4all.nl>
  979 
  980 	* README: Changed email addresses to point to new ISP.
  981 
  982 Mon Apr 28 09:08:29 1997  Hans de Graaff  <graaff@xs4all.nl>
  983 
  984 	* checkbot.pl: Parsing VERSION is somewhat tricky. Fixed.
  985 
  986 Sun Apr 27 21:02:58 1997  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
  987 
  988 	* checkbot.pl (check_external): Close EXTERNAL after use.
  989 
  990 Sun Apr 20 10:24:09 1997  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
  991 
  992 	* checkbot.pl: Fixed a number of small bugs reported by Jost Krieger.
  993 	Regular expressions can now be used with the options.
  994 	Added --interval option to denote maximum interval between updates.
  995 
  996 Sat Apr  5 17:03:46 1997  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
  997 
  998 	* checkbot.pl (init_globals): Added checks for URLs without a scheme.
  999 
 1000 Fri Mar 14 11:17:21 1997  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1001 
 1002 	* checkbot.pl (print_help): Fix typo.
 1003 
 1004 Tue Jan 14 16:51:36 1997  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1005 
 1006 	* checkbot.pl (check_internal): Check whether there are really
 1007 	entries in the new queue when changing queues.
 1008 
 1009 Sat Jan  4 14:26:04 1997  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1010 
 1011 	* checkbot.pl (print_help): --seconds should be --sleep in help.
 1012 
 1013 Mon Dec 30 12:03:14 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1014 
 1015 	* checkbot.pl (handle_url): If a URL is exclude'd, only use HEAD
 1016 	on it, not GET.
 1017 	Starting URLs can now be entered on the command line in addition
 1018 	to the --url option. --url takes precedence. --match is
 1019 	initialized with first URL if not given as separate option.
 1020 
 1021 Mon Dec 23 20:21:32 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1022 
 1023 	* checkbot.pl (print_server_problems): Each error message was
 1024 	evaluated as a regexp, potentially crashing checkbot on a bad
 1025 	regexp (e.g. including the string '++').
 1026 
 1027 Mon Dec 23 15:15:05 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1028 
 1029 	* checkbot.pl (ip_address): Deal with IP-address not found.
 1030 
 1031 Sun Dec  8 12:55:33 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1032 
 1033 	* checkbot.pl (send_mail): --note didn't work; Checkbot would
 1034 	crash when no external links were found.
 1035 
 1036 Wed Dec  4 12:43:14 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1037 
 1038 	* checkbot.pl (add_checked): All checked URLs are indexed using
 1039 	  IP-address to avoid checking pages multiple times for multiple
 1040 	  CNAME's.
 1041 
 1042 Mon Nov  4 14:19:30 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1043 
 1044 	* checkbot.pl (send_mail): Braino in URL fixed.
 1045 
 1046 Sun Oct 27 20:16:38 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1047 
 1048 	* checkbot.pl (init_globals): Don't let --match default to the
  1049 	--url until after we possibly change the URL (this happens for
 1050 	file:/ URLs, currently)
 1051 
 1052 Wed Oct 23 14:22:15 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1053 
 1054 	* checkbot.pl (check_point): Oops, checking would occur every minute
 1055 
 1056 Mon Oct 21 13:41:48 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1057 
 1058 	* checkbot.pl (print_help): Added version number to help info.
 1059 
 1060 Wed Oct 16 21:05:58 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1061 
 1062 	* checkbot.pl: Added --proxy option for checking external links
 1063 	through a proxy server
 1064 
 1065 Sat Sep 28 09:26:48 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1066 
 1067 	* checkbot.pl (init_globals): Changed /var/tmp to /tmp.
 1068 	(check_point): Slower exponential rate, upper limit of 3 hours
 1069 
 1070 	* Makefile.PL: Added check for Mail::Send
 1071 
 1072 	* README: Added
 1073 
 1074 Thu Sep 26 17:01:36 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1075 
 1076 	* checkbot.pl: Switched from short options to long options.
 1077 	I was already running out of meaningful options, so before adding
 1078 	additional stuff I wanted to move to Long options first. You
 1079 	should be able to abbreviate most options to the previous values. 
 1080 	Notable exception is -m, which has become --match.
 1081 
 1082 Wed Sep 25 10:58:06 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1083 
 1084 	* checkbot.pl: 
 1085 	Renamed from checkbot
 1086 	Added preamble to set proper path for perl (code from Gisle Aas)
 1087 
 1088 	* Makefile.PL: First version, installs checkbot and checkbot.1
 1089 
 1090 	* checkbot: Changed $revision to $VERSION for MakeMaker.
 1091 
 1092 Thu Sep 12 15:09:07 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1093 
 1094 	* index.html: updated required modules and location.
 1095 
 1096 	* checkbot: require LWP-5.02, because it fixes a few nasty bugs.
 1097 
 1098 Thu Sep  5 16:00:42 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1099 
 1100 	* index.html: 
 1101 	Removed old and out-of-date documentation. Replaced by link to
 1102 	automatically generated html version of POD documentation
 1103 	within Checkbot.
 1104 
 1105 	* checkbot:
 1106 	Fixed documentation bugs.
 1107 	Really fix the case insensitive comparison.
 1108 
 1109 Sun Sep  1 20:31:46 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1110 
 1111 	* checkbot (print_server_problems): 
 1112 	Make comparison for error message case insensitive.
 1113 
 1114 Fri Aug 30 20:19:56 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1115 
  1116 	* checkbot: Fixed several typos.
 1117 
 1118 Wed Aug  7 10:06:29 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1119 
 1120 	* checkbot (handle_doc): 
  1121 	The new LinkExtractor is nice, but I shouldn't treat its output as
  1122 	a hash when it is an array, thus skipping every other link.
 1123 
 1124 Mon Aug  5 08:46:24 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1125 
 1126 	* checkbot (print_server): 
 1127 	Fixed silly bug in calculating the percentage of problems on each
 1128 	server.
 1129 
 1130 Fri Aug  2 21:38:39 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1131 
 1132 	* checkbot: Added several patches by Bruce Speyer:
 1133 	Added -N note option to go along with -M, -z to suppress reporting
 1134 	errors on matching links.
  1135 	Added enough logic to catch gopher URLs if no gopher server is found.
 1136 	Need further logic to parse gopher returned menu for bad file or
 1137 	directory.
 1138 
 1139 	* checkbot: Made a good start with POD documentation inside the
 1140 	checkbot file. Try 'perldoc checkbot'.
 1141 
  1142 	* TODO: Added a number of suggestions by Luuk de Boer.
 1143 
 1144 	* checkbot (send_mail): Include summary of links checked in message.
 1145 
 1146 Fri Aug  2 13:01:02 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1147 
 1148 	* checkbot: 
 1149 	Added check for correct LWP version. We now need 5.01, due to bugs
 1150 	in the handling of the BASE attribute in previous versions.
 1151 
 1152 Sat Jul 27 21:13:26 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1153 
 1154 	* checkbot: 
 1155 	Added several patches by Bruce Speyer:
 1156 	Optimized some static regular expressions.
 1157 	Fixed not setting the timeout, making the -t option useless.
 1158 
 1159 Mon Jul 22 22:28:34 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1160 
 1161 	* checkbot (create_page): 
 1162 	Fixed number of columns in summary output.
 1163 
 1164 Sat Jul 20 11:49:23 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1165 
 1166 	* checkbot (handle_doc): Changed to use the new HTML::LinkExtor,
 1167 	which will be present in LWP5.01. Should be more efficient, and
 1168 	less prone to memory leaks.
 1169 
 1170 Sat Jul 13 12:41:23 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1171 
 1172 	* checkbot (create_page): Forgot to add the ratio on the page.
 1173 	(check_external): Fix problems with different `wc` output.
 1174 
 1175 Sat Jun 22 11:30:12 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1176 
 1177 	* checkbot: Use correct base URL as returned with the document.
 1178 	Only check document when we used 'GET' to receive it.
 1179 	Remove magic guessing with ending slash of starting url.
 1180 	Deal with redirections by inserting redirected URLs into queue
 1181 	again.
 1182 
 1183 Thu Jun 20 15:58:20 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1184 
 1185 	* checkbot: Major cleanup of initialization code. Also added todo
 1186 	counts to progression page, and proper todo handling for external
 1187 	links.
 1188 
 1189 Sun Jun 16 21:16:28 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1190 
 1191 	* checkbot: Added -M option: send mail when Checkbot is done.
 1192 	Fixed division by zero bug when external links == 0
 1193 
 1194 Tue Jun  4 12:46:39 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1195 
 1196 	* checkbot: Better way to ignore fragments.
 1197 
 1198 Sat Jun  1 15:14:52 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1199 
  1200 	* checkbot: Don't print decimals with the percentages.
 1201 	Major update of counting, and printing counts. Cleaned up
 1202 	variables, corrected counting, made display more consistent and
 1203 	clear.
 1204 
 1205 Wed May 29 21:18:26 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1206 
 1207 	* checkbot: Small fixes to support lwp-win32 as well, thanks to
 1208 	Martin Cleaver.
 1209 
 1210 Mon May 27 09:21:30 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1211 
 1212 	* checkbot: oops, small error in regexp caused script to append a
 1213 	slash to almost all start-url's. Fixed.
 1214 	
 1215 	* checkbot (handle_doc): External links without full URL's were
 1216 	not always handled properly.
 1217 
 1218 Sun May 26 10:04:39 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1219 
 1220 	* checkbot: If the starting URL doesn't end in a slash, and
 1221 	doesn't have an extension, assume we need to add a slash.
 1222 
 1223 	* index.html: Add version number to web page, and make sure it gets
 1224 	updated automatically.
 1225 
 1226 Wed May 22 09:58:36 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1227 
 1228 	* checkbot: Changed verbose output of links found on pages.
 1229 
 1230 Tue May 14 16:43:38 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1231 
 1232 	* TODO: updated with respect to recent changes.
 1233 
 1234 Mon May 13 15:06:05 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1235 
 1236 	* checkbot: Added LWP version number to agent field, changed page
 1237 	update policy, and updated script to LWP5b13.
 1238 
 1239 Sat May  4 21:38:56 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1240 
 1241 	* checkbot: Changed checked array to an associative array. Will
 1242 	consume more memory, but drastically cut back on lookup time.
 1243 	
 1244 	Rewrote handle_url logic to be more clear. Also fixed bug where
 1245 	servers would be added to the list unjustly.
 1246 
 1247 	Sleep was only done on problem links, not after each request.
 1248 
 1249 	Also added checks for already checked links while scanning through
 1250 	the document, and only add those links not checked to the queue.
 1251 
 1252 	Add percentage problem links for each individual server.
 1253 
 1254 Mon Apr 29 08:43:12 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1255 
 1256 	* checkbot: Deal with unknown or non-determinable server types.
 1257 	
 1258 	Only add links to the external queue when we know we can check
 1259 	their protocol.
 1260 
 1261 	Additional changes to layout and content of pages.
 1262 
 1263 Sun Apr 28 21:16:51 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1264 
 1265 	* checkbot: Rewrote report page.
 1266 
 1267 Wed Apr 24 22:39:43 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1268 
 1269 	* checkbot: Added a number of patches by Tim MacKenzie
 1270 	Added -s option to set the seconds of sleep between requests.
 1271 	Remove work files when *not* debugging.
 1272 	Only compile -m and -x regular expressions once.
 1273 	Also check external ftp and nntp links (using HEAD only).
 1274 	Get rid of huge memory leak! (Also noted by Fabrice Gaillard)
 1275 
 1276 Fri Mar 29 10:58:24 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1277 
 1278 	* checkbot: 
 1279 	Got rid of warnings about some variables.
 1280 	Fixed problem with incorrect automatic -m argument when scanning
 1281 	local files.
 1282 
 1283 Sun Mar 24 18:01:05 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1284 
 1285 	* checkbot:
 1286 	Added code to support regular expressions with the -m and -x
 1287 	arguments. Thanks to Thomas Thiel for the patch and suggestions.
 1288 	
 1289 	No strict checking on schemes, fixes problem with unknown schemes
 1290 	stopping checkbot. Thanks to Pierre-Yves Foucou.
 1291 
 1292 	* checkbot: 
  1293 	Should create directory for temporary files, and remove it
 1294 	afterwards. Noted by Steve Fisk.
 1295 
 1296 Sat Mar 16 13:40:48 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1297 
 1298 	* checkbot: 
 1299 	Made a number of changes from or based on patches by Thomas Thiel:
 1300 
 1301 	Added missing t option in Getopts string.
 1302 
 1303 	Made -m argument optional. If not given, the -u argument is also
 1304 	used as the start argument.
 1305 
 1306 	Temporary files are now created in a separate directory. Its name
 1307 	contains the PID of Checkbot, to allow several concurrent
 1308 	Checkbots being run. Also remove temporary files, unless
 1309 	debugging.
 1310 
 1311 	Implement file:// scheme to allow direct checking (without HTTP
  1312 	server)
 1313 
 1314 Fri Mar 15 11:06:13 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1315 
 1316 	* checkbot: 
 1317 	Fixed warnings (and in the process, a small bug as well).
 1318 	Added URL and proper name to help.
 1319 
 1320 Sat Mar  2 11:51:45 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1321 
 1322 	* checkbot: 
 1323 	Added 'require 5.002' (because libwww-perl5b8 requires it).
 1324 	Added 'use strict', and fixed problems resulting from this. This
 1325 	can be seen as a first step towards fixing the huge
 1326 	memory-consumption.
 1327 	Updated help.
 1328 
 1329 Tue Feb 27 09:57:57 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1330 
 1331 	* checkbot:
  1332 	Fixed bug which occurred when -x option was not present.
 1333 	Updated script to use libwww-perl5b8 function names. This is not
 1334 	backward compatible with versions prior to beta 8.
 1335 
 1336 Mon Feb 26 12:46:08 1996  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1337 
 1338 	* checkbot:
 1339 	Fixed bug with Referer header for external URL's.
 1340 	Also make server pages auto-refresh.
 1341 
 1342 Sat Feb 24 11:48:15 1996  Hans de Graaff  <Hans.deGraaff@twi72.twi.tudelft.nl>
 1343 
 1344 	* TODO: New file.
 1345 
 1346 	* checkbot: Added single -x option as an additional exclude pattern.
 1347 	This overrules the -m match attribute.
 1348 
 1349 Mon Dec 11 14:13:30 1995  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1350 
 1351 	* index.html
 1352 	Added libwww-perl5 address, and added a usage section.
 1353 
 1354 	* checkbot.pl
 1355 	Removed this old perl4 version.
 1356 
 1357 Fri Dec  8 13:41:43 1995  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1358 
 1359 	* checkbot: 
 1360 	Major rewrite of most of the internal routines. The routines are
 1361 	much more structured now, and broken up into smaller routines.
 1362 	I also changed the way checked links are remembered. It should be
 1363 	much less efficient, CPU-wise, but more efficient memory-wise.
 1364 
 1365 Fri Nov 24 16:45:18 1995  Hans de Graaff  <J.J.deGraaff@twi.tudelft.nl>
 1366 
 1367 	* checkbot:
 1368 	Fixed small problems, mostly with output.
 1369 	Fixed checking of external links
 1370 	Changed sorting order
 1371 
 1372 	* checkbot: 
 1373 	Perl5 version now works for the most part. Although Checkbot isn't
 1374 	fully finished I at least feel confident to release it.
 1375 
 1376 Fri Aug 25 11:23:36 1995  Hans de Graaff  <graaff@is.twi.tudelft.nl>
 1377 
 1378 	* Made a start with the perl5 version of checkbot. The modules in
 1379 	perl5 (e.g. LWP) look very promising, and should make checkbot
 1380 	quite a bit better.