"Fossies" - the Fresh Open Source Software Archive 
Member "checkbot-1.80/ChangeLog" (15 Oct 2008, 45564 Bytes) of package /linux/www/old/checkbot-1.80.tar.gz:
As a special service "Fossies" has tried to format the requested text file into HTML format (style:
standard) with prefixed line numbers.
1 2008-10-15 Hans de Graaff <hans@degraaff.org>
2
3 * Checkbot 1.80 is released
4
5 2008-07-08 Hans de Graaff <hans@degraaff.org>
6
7 * checkbot (handle_doc): Tighten up the check for a robots tag so
8 that nofollow text later in the document won't be matched, thus
9 skipping the whole document, bug 2005950.
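
The tightened check can be pictured as a pattern that has to match inside a single meta tag. The sketch below only illustrates that idea and is not checkbot's actual expression: the helper name has_robots_nofollow is hypothetical, and the pattern assumes the name attribute comes before content.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from checkbot itself.
# Returns 1 when a robots meta tag carries "nofollow", 0 otherwise.
sub has_robots_nofollow {
    my ($html) = @_;
    # [^>]* cannot cross the end of the tag and [^"'>]* cannot leave the
    # content value, so "nofollow" later in the document is never matched.
    # Assumes the name attribute comes before content.
    return $html =~ m{<meta\s[^>]*name\s*=\s*["']?robots["']?[^>]*content\s*=\s*["'][^"'>]*nofollow}is
        ? 1 : 0;
}

my $plain  = qq{<meta name="robots" content="index,follow">\n<p>About rel=nofollow</p>};
my $tagged = qq{<meta name="robots"\n content="noindex, nofollow">};

print has_robots_nofollow($plain),  "\n";   # prints 0
print has_robots_nofollow($tagged), "\n";   # prints 1
```

A looser pattern such as /robots.*nofollow/s would report the first page as nofollow as well, which is the kind of false match this entry describes.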
10
11 2007-05-05 Brandon Bell <Brandon_Bell@bcit.ca>
12
13 * checkbot: mms scheme can be ignored safely.
14
15 2007-04-30 Hans de Graaff <hans@degraaff.org>
16
17 * checkbot (printAllServers): Clarify that 'Unique links' actually
18 is 'Documents scanned'.
19
20 2007-02-26 Hans de Graaff <hans@degraaff.org>
21
22 * checkbot (handle_doc): Handle the case where decoded_content is
23 not available as per bug 1665075.
24
25 2007-02-26 Gerald Pfeifer <gerald@pfeifer.com>
26
27 * checkbot (check_point): Simplify and add a comment.
28
29 2007-02-26 Hans de Graaff <hans@degraaff.org>
30
31 * Makefile.PL: Require LWP 5.803 or better. decoded_content got
32 added in 5.802 and 5.803 added some important bugfixes.
33
34 2007-02-03 Hans de Graaff <hans@degraaff.org>
35
36 * Checkbot 1.79 is released
37
38 * RELEASE-PROCESS: Add the release process documentation.
39
40 2007-01-27 Gerald Pfeifer <gerald@pfeifer.com>
41
42 * checkbot (init_suppression): Check and provide error if
43 suppression file is in fact a directory.
44
45 2006-12-28 Hans de Graaff <hans@degraaff.org>
46
47 * checkbot: Add summary to tables to make files XHTML 1.1 compliant.
48
49 2006-11-16 Hans de Graaff <hans@degraaff.org>
50
51 * checkbot (handle_doc): Parse the decoded content so that all
52 character set issues are dealt with before parsing. This solves
53 bug 1264729.
54
55 2006-11-14 Hans de Graaff <hans@degraaff.org>
56
57 * checkbot (performRequest): Simplify the code dealing with
58 problems of HEAD requests by retrying all 500 responses instead of
59 special-casing particular failures that we happen to know
60 about. This type of problem is all too common, and if there really
61 is a problem GET will find it anyway.
62 (add_error): Allow regular expressions in the suppression
63 file. Based on patch from Eric Noack.
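
The HEAD-then-GET policy described for performRequest can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in checkbot: the function name status_with_fallback and the $fetch callback are hypothetical, and the callback stands in for the real requests that checkbot makes through LWP.

```perl
use strict;
use warnings;

# Sketch of the retry policy: try HEAD first, fall back to GET on a 500.
# $fetch is a hypothetical callback taking ($method, $url) and returning
# an HTTP status code.
sub status_with_fallback {
    my ($fetch, $url) = @_;
    my $code = $fetch->('HEAD', $url);
    # A 500 on HEAD often just means the server mishandles HEAD,
    # so let a GET decide whether the link is really broken.
    $code = $fetch->('GET', $url) if $code == 500;
    return $code;
}

# Fake server that chokes on HEAD but answers GET normally.
my @calls;
my $fake_fetch = sub {
    my ($method, $url) = @_;
    push @calls, $method;
    return $method eq 'HEAD' ? 500 : 200;
};

my $status = status_with_fallback($fake_fetch, 'http://www.example.com/');
print "$status after @calls\n";   # prints "200 after HEAD GET"
```

The trade-off is one extra request for links that really do return 500, in exchange for not having to maintain a list of servers known to mishandle HEAD.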
64
65 2006-11-14 Eric Noack <en@lightwerk.com>
66
67 * checkbot (send_mail): Indicate how many errors are detected in
68 the notification email's subject.
69 (handle_doc): Use the URL with which the document was received for
70 the problem reports and internal accounting, but keep on using the
71 proper base URL as defined by the response object when retrieving
72 links from the document. This fixes the case where a weird BASE
73 URL in a document could make it unclear where the actual problem
74 was.
75
76 2006-10-28 Hans de Graaff <hans@degraaff.org>
77
78 * checkbot (performRequest): Handle case where an FTP server may
79 not be able to handle a HEAD request. This may cause a lot of data
80 to be transferred in those cases.
81
82 2006-05-03 Hans de Graaff <hans@degraaff.org>
83
84 * Checkbot 1.78 is released
85
86 2005-12-18 Hans de Graaff <hans@degraaff.org>
87
88 * checkbot (printServerProblems): Make pages XHTML compliant again.
89
90 2005-12-18 Jens Schweikhardt <schweikh@schweikhardt.net>
91
92 * checkbot: Add classes and ids so that more styling options for
93 CSS are available.
94 * checkbot2.css: Example CSS file using the new classes and ids.
95
96 2005-11-11 Hans de Graaff <hans@degraaff.org>
97
98 * checkbot: React in a more subtle way if the Time::Duration
99 module is not found.
100
101 2005-09-22 Hans de Graaff <hans@degraaff.org>
102
103 * Makefile.PL: Check for presence of Net::SSL and explain the
104 effects if this is not present.
105
106 2005-08-20 Hans de Graaff <hans@degraaff.org>
107
108 * checkbot (handle_doc): Ignore some 'links' found by LinkExtor
109 which do not need to link to live links. Fixed bugs #1264447 and
110 #1107832.
111
112 * test.html: Add test cases for it.
113
114 2005-08-06 Hans de Graaff <hans@degraaff.org>
115
116 * checkbot (performRequest): Switch from HEAD to GET on a 400
117 error, as the most likely cause is that the server has trouble
118 with HEAD requests.
119
120 2005-08-05 Hans de Graaff <hans@degraaff.org>
121
122 * checkbot (handle_doc): Also show how many new links are found on
123 a page, not just the total number of links.
124 (performRequest): Don't retry GET method on a 403 error.
125 (handle_doc): Properly handle newlines in the matches for title
126 and robots meta tag.
127
128 2005-07-28 Hans de Graaff <hans@degraaff.org>
129
130 * Checkbot 1.77 is released.
131
132 * checkbot: Fix use of $VERSION so that it compiles and can be
133 used by MakeMaker at the same time.
134 (handle_doc): Check for presence of robots meta tag and act on it.
135 Based on a patch by Donald Willingham.
136
137 2005-07-25 Hans de Graaff <hans@degraaff.org>
138
139 * Checkbot 1.76 is released.
140
141 2005-06-07 Hans de Graaff <hans@degraaff.org>
142
143 * checkbot (printServerProblems): Include title of page.
144 (handle_doc): Extract title for later printing.
145 Add new hash url_title to store page titles.
146 Based on a patch from John Bintz.
147
148 2005-04-23 Hans de Graaff <hans@degraaff.org>
149
150 * checkbot: Add documentation on use of file:/// URLs.
151
152 2005-01-23 Hans de Graaff <hans@degraaff.org>
153
154 * checkbot: Only send mail when Checkbot has detected any
155 problems, based on suggestion from Thomas Kuerten.
156
157 Print duration of run on final report, and refactor use of start
158 time variable to facilitate this. Feature depends on availability
159 of Time::Duration, but checkbot will work without it. Based on
160 patch from Adam Griff.
161
162 2005-01-23 Adam Griff <griff@computer.org>
163
164 * checkbot (create_page): Print out more options on results page.
165
166 2005-01-21 Hans de Graaff <hans@degraaff.org>
167
168 * checkbot: Remove automatic version number based on CVS version
169 now that commits will be more frequent than releases.
170
171 2004-11-12 Hans de Graaff <hans@degraaff.org>
172
173 * checkbot (handle_url): Ignore javascript: URLs instead of
174 generating a 904 error. It would be nice to handle these as well.
175
176 2004-05-26 Hans de Graaff <hans@degraaff.org>
177
178 * Makefile.PL: Sync HTML::Parser requirement with required
179 versions of libwww-perl.
180
181 2004-05-03 Hans de Graaff <hans@degraaff.org>
182
183 * checkbot: Write better documentation for --file option.
184
185 2004-04-26 Hans de Graaff <hans@degraaff.org>
186
187 * checkbot: Minor documentation changes thanks to Jens
188 Schweikhardt.
189
190 2004-04-22 Hans de Graaff <hans@degraaff.org>
191
192 * Checkbot 1.75 is released.
193
194 2004-04-21 Hans de Graaff <hans@degraaff.org>
195
196 * checkbot (print_help): Use a here-doc for the help for easier
197 maintenance.
198 (init_modules): Add --noproxy option to set list of domains which
199 will not be passed through the proxy.
200
201 2004-04-18 Hans de Graaff <hans@degraaff.org>
202
203 * checkbot (handle_url): Create an error if an unknown scheme is
204 encountered and only ignore known schemes like mailto:
205
206 2004-03-30 Hans de Graaff <hans@degraaff.org>
207
208 * checkbot: Add explanation about error message which indicates
209 lack of SSL support.
210
211 2004-03-28 Hans de Graaff <hans@degraaff.org>
212
213 * checkbot: Add EXAMPLES section to the perldoc documentation with
214 an example of the most simple invocation. Needs more examples...
215 Update help text for --mailto to confirm that more than one
216 address is possible.
217
218 * checkbot: Add new --cookies option to accept cookies from
219 servers. Based on patch from Roger Pilkey.
220
221 2004-02-09 Hans de Graaff <hans@degraaff.org>
222
223 * Makefile.PL: Show correct text if LWP test fails.
224
225 2004-01-05 Hans de Graaff <hans@degraaff.org>
226
227 * Makefile.PL: Now require LWP 5.76 to avoid problems with 500
228 "Need a field name" HTTP errors being generated by LWP.
229
230 2003-12-29 Gerald Pfeifer <gerald@pfeifer.com>
231
232 * checkbot: Improve description of --proxy.
233 (print_help): Ditto.
234
235 2003-12-21 Hans de Graaff <hans@degraaff.org>
236
237 * checkbot (performRequest): $url->authority may not be defined
238 for the URL we are checking.
239
240 2003-12-17 Hans de Graaff <hans@degraaff.org>
241
242 * Checkbot 1.74 is released
243
244 * checkbot (add_error): Take into account that status message can
245 be undefined.
246
247 2003-12-15 Hans de Graaff <hans@degraaff.org>
248
249 * checkbot: Put Checkbot errors in a hash to have one set of
250 descriptions around.
251 (handle_doc): Use it.
252 (checkbot_status_message): Use it to find the status message for a
253 code from HTTP codes, Checkbot codes, or a generic status message.
254 (printServerProblems): Use it.
255 (handle_url): Move checks for --dontwarn and --suppression
256 features from here ...
257 (add_error): ... to here so that it applies to all errors.
258
259 2003-12-14 Hans de Graaff <hans@degraaff.org>
260
261 * checkbot: Document that Checkbot defines its own response codes
262 for common problems.
263 No longer a need for the %warning hash.
264 (add_error): New function to add a new error into the hashes.
265 (handle_url): Use it.
266 (handle_doc): Use it for what previously were warnings.
267 (printServerWarnings): Obsolete as warnings have been changed to
268 use the normal error handling routines.
269 Marked --allow-simple-hosts option as deprecated, because this can
270 now be handled in a more generic way by the --dontwarn mechanism.
271 (print_help): Removed --allow-simple-hosts option from help.
272 (add_to_queue): Move code to check for double slash in URL to ...
273 (handle_doc): ... here as Checkbot error 903.
274
275 2003-11-29 Hans de Graaff <hans@degraaff.org>
276
277 * checkbot (printServerProblems): Oops. Make sure all output is
278 going to the right file, not stdout.
279 Add new --suppress option which reads a file with response code /
280 URL combinations to be suppressed in the output, based on patch by
281 Rob Chekaluk.
282 (init_suppression): Read suppression file and fill hash with
283 results.
284 (handle_url): Use it.
285 (print_help): Document it.
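
A suppression table of response code / URL combinations could be filled and consulted roughly as shown here. The line format ("CODE URL", with "CODE /regex/" for the regular expressions that the 2006-11-14 entry mentions) is an assumption made for this sketch, and parse_suppressions and is_suppressed are hypothetical names rather than checkbot functions.

```perl
use strict;
use warnings;

# Hypothetical parser for suppression entries; the line format
# ("CODE URL" or "CODE /regex/") is an assumption made for this sketch.
sub parse_suppressions {
    my (@lines) = @_;
    my %suppress;
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($code, $target) = $line =~ /^\s*(\d+)\s+(\S+)\s*$/ or next;
        push @{ $suppress{$code} }, $target;
    }
    return \%suppress;
}

# True when the code/URL combination is listed, either literally
# or through a pattern written between slashes.
sub is_suppressed {
    my ($suppress, $code, $url) = @_;
    for my $target (@{ $suppress->{$code} || [] }) {
        if ($target =~ m{\A/(.+)/\z}) {
            my $pattern = $1;
            return 1 if $url =~ /$pattern/;
        }
        else {
            return 1 if $url eq $target;
        }
    }
    return 0;
}

my $rules = parse_suppressions(
    '# known problems',
    '403 http://www.example.com/private/',
    '500 /^http:\/\/flaky\.example\.org\//',
);

print is_suppressed($rules, 403, 'http://www.example.com/private/'), "\n";     # prints 1
print is_suppressed($rules, 500, 'http://flaky.example.org/page.html'), "\n";  # prints 1
print is_suppressed($rules, 404, 'http://www.example.com/private/'), "\n";     # prints 0
```

Keying the table on the response code first means a URL is only hidden for the specific problem that was acknowledged; a different error on the same URL is still reported.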
286
287 2003-11-24 Hans de Graaff <hans@degraaff.org>
288
289 * checkbot: Add example to --ignore argument.
290
291 2003-11-23 Hans de Graaff <hans@degraaff.org>
292
293 * checkbot (init_modules): Delete commented-out code to enable
294 HTTP 1.1 in LWP. HTTP 1.1 has been the default in LWP for a while
295 and does not need special code to be enabled.
296
297 2003-11-21 Hans de Graaff <hans@degraaff.org>
298
299 * checkbot (printServerProblems): Don't assume that status_message
300 is defined for all possible codes, based on patch by Thomas
301 Kuerten.
302
303 2003-10-18 Hans de Graaff <hans@degraaff.org>
304
305 * Makefile.PL: Require LWP 5.70 because problems with HEAD of
306 ftp:// links have been solved in this release.
307
308 2003-09-05 Hans de Graaff <hans@degraaff.org>
309
310 * checkbot (printServerProblems): Put line breaks in HTML file in
311 a more logical place.
312
313 2003-08-31 Hans de Graaff <hans@degraaff.org>
314
315 * Checkbot 1.73 released
316
317 2003-08-30 Hans de Graaff <hans@degraaff.org>
318
319 * checkbot (printServerProblems): Protect against undefined status.
320
321 2003-08-29 Hans de Graaff <hans@degraaff.org>
322
323 * checkbot (handle_doc): Ignore URIs matching --ignore as they are
324 being found.
325 (handle_url): Remove check for --ignore option here.
326 Update documentation for --ignore.
327 (print_help): Idem.
328
329 2003-08-21 Hans de Graaff <hans@degraaff.org>
330
331 * checkbot: Made --interval description a bit more clear.
332
333 2003-07-26 Hans de Graaff <hans@degraaff.org>
334
335 * checkbot (init_modules): Uncomment proxy support, but it now
336 applies to all requests, not just external ones.
337 (print_help): Update --proxy help text.
338 Update perldoc documentation.
339
340 2003-07-05 Hans de Graaff <hans@degraaff.org>
341
342 * checkbot: Additional explanation for --exclude option.
343
344 2003-06-28 Bernd Petrovitsch <bernd@firmix.at>
345
346 * checkbot.css: Additional cleaning up of the CSS file.
347
348 2003-06-26 Bernd Petrovitsch <bernd@firmix.at>
349
350 * checkbot: Produce valid XHTML 1.1 pages.
351
352 * checkbot.css: Clean up of the CSS file.
353
354 2003-05-04 Hans de Graaff <hans@degraaff.org>
355
356 * Checkbot 1.72 released
357
358 * checkbot: Applied spelling fixes from Jens Schweikhardt.
359 (clean_up): Factored out of check_links so that it can also be
360 called when we catch a signal.
361 (got_signal): Catch signals like SIGINT and handle them, based on
362 patch by Jens Schweikhardt.
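
The arrangement of a signal handler that calls a shared clean-up routine can be sketched like this. The bodies of clean_up and got_signal are placeholders written for illustration, not the code in checkbot, and the self-signalling line assumes a Unix-like system.

```perl
use strict;
use warnings;

my $cleaned_up = 0;

# Stand-in for the real clean-up work (writing the final pages).
sub clean_up {
    $cleaned_up = 1;
    print "writing final report\n";
}

# Perl passes the signal name (e.g. "INT") to the handler.
sub got_signal {
    my ($name) = @_;
    print "caught SIG$name\n";
    clean_up();
    # A real handler would normally exit here; the sketch carries on
    # so that the effect can be observed.
}

$SIG{INT} = \&got_signal;

kill 'INT', $$;                # simulate Ctrl-C by signalling ourselves
sleep 1 until $cleaned_up;     # the handler may run slightly later
print "done\n";
```

Because clean_up is a separate routine, the normal end of a run and an interrupted run can both produce the same final output.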
363
364 2003-04-06 Hans de Graaff <hans@degraaff.org>
365
366 * checkbot (handle_url): No longer ignore URLs with a query
367 string. If checking these is not wanted then the --exclude option
368 can be used, and an example for this is now included in the
369 documentation.
370
371 2003-03-30 Hans de Graaff <hans@degraaff.org>
372
373 * checkbot (printServerProblems): Add links to different error
374 codes on a server page for quick navigation.
375
376 2003-02-22 Paul Merchant, Jr. <Paul.L.Merchant.Jr@Dartmouth.EDU>
377
378 * checkbot: Initialize the statistics counters to avoid warnings.
379
380 2003-01-15 Hans de Graaff <hans@degraaff.org>
381
382 * checkbot (output): Correct the check for --verbose; not
383 specifying it now generates no output.
384
385 2003-01-06 Hans de Graaff <hans@degraaff.org>
386
387 * checkbot (handle_doc): The host name check does not make much
388 sense for news: scheme URLs.
389
390 2003-01-03 Hans de Graaff <hans@degraaff.org>
391
392 * checkbot (init_globals): Only remove file from default --match
393 argument when there is a path component in the start URL.
394 Initialize problem counter to avoid warning about uninitialized
395 value.
396
397 2002-12-29 Hans de Graaff <hans@degraaff.org>
398
399 * Checkbot 1.71 released
400
401 * checkbot (handle_url): Make sure we feed is_internal a string.
402 (handle_url): Use existing variable instead of Referer header to
403 store parent URL.
404
405 * Checkbot 1.70 created for testing, but not released
406
407 * checkbot (performRequest): Add HTTP 403 error to list of error
408 codes to retry with a GET.
409 (handle_url): Only follow redirections for internal links.
410
411 2002-12-28 Hans de Graaff <hans@degraaff.org>
412
413 * checkbot: Removed reference to AnyDBM_File because it is not
414 used anywhere.
415 Rewrote global statistics gathering to be more simple and more
416 accurate.
417 Added --filter option which allows rewriting of URLs before they
418 are checked, based on patch from Eli the Bearded <eli@netusa.net>.
419 Simplified storage of URLs with problems
420 (get_headers): Removed.
421 (performRequest): Included code from get_headers here.
422 (count_problems): Updated for new storage of URLs
423 (printServerProblems): Idem.
424 (handle_url): Idem.
425 (handle_doc): Idem.
426 (count_problems): Idem.
427 (printServerProblems): Idem.
428 (handle_doc): Add code to report all pages on which a problematic
429 URL appears.
430 (init_globals): Changed default --match argument to exclude final
431 page name.
432
433
434 2002-12-27 Hans de Graaff <hans@degraaff.org>
435
436 * checkbot (output): Moved printing, including indentation and
437 verbose checking, to function 'output'.
438 (handle_doc): No more distinction between internal and external
439 links, we throw all links found in the queue.
440 (handle_doc): Removed statistics for now, they are too buggy.
441 (is_checked): New function takes into account that we sometimes
442 translate hostnames to IP addresses.
443 (handle_doc): Use it.
444 (check_internal): Removed dependency on statistics, use actual
445 queue contents to determine when all links are checked.
446 (handle_url): Only query server for file type on
447 application/octet-stream documents.
448 (is_internal): New function to determine if URL is internal.
449 (handle_url): Rewritten to use new functions and to deal with
450 external URLs being mixed in, and generally cleaned up.
451 (handle_url): Moved --internal-only checks here.
452 (check_external): Removed.
453 (check_links): Renamed from check_internal.
454 Added small blurb to documentation on distinction between internal
455 and external links and the way checkbot checks these.
456
457 * t/test.t: Added simple test case: can checkbot be run without
458 arguments?
459
460 2002-12-25 Hans de Graaff <hans@degraaff.org>
461
462 * Checkbot 1.69 released
463
464 2002-12-25 Hans de Graaff <hans@degraaff.org>
465
466 * checkbot (get_headers): Make sure feedback on HEAD requests gets
467 indented properly.
468
469 2002-12-23 Hans de Graaff <hans@degraaff.org>
470
471 * checkbot (init_globals): Anchor automatic match argument based
472 on start URLs at the beginning.
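
Why the anchor matters can be seen in a small sketch of how a default match pattern might be derived from the start URLs. The helper default_match is hypothetical and only approximates the behaviour described in this log (drop the final page name when the URL has a path, then anchor at the beginning); it is not checkbot's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a default match pattern from the start URLs.
sub default_match {
    my (@start_urls) = @_;
    my @prefixes;
    for my $url (@start_urls) {
        my $prefix = $url;
        # Drop a trailing page name, but only when the URL has a path.
        $prefix =~ s{[^/]+\z}{} if $prefix =~ m{\A[a-z][a-z0-9+.-]*://[^/]+/}i;
        push @prefixes, quotemeta $prefix;
    }
    # The leading ^ anchors every alternative at the start of the URL.
    return '^(?:' . join('|', @prefixes) . ')';
}

my $match = default_match('http://www.example.com/docs/index.html');

for my $candidate (
    'http://www.example.com/docs/page2.html',
    'http://elsewhere.example.org/?next=http://www.example.com/docs/',
) {
    printf "%-8s %s\n", ($candidate =~ /$match/ ? 'internal' : 'external'), $candidate;
}
```

Without the anchor, the second URL would count as internal simply because the start URL appears inside its query string.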
473
474 2002-12-16 Jens Schweikhardt <schweikh@schweikhardt.net>
475
476 * checkbot (check_external): Fixed printf to be print so that
477 actual information can be printed using --verbose.
478
479 2002-12-02 Hans de Graaff <hans@degraaff.org>
480
481 * checkbot (get_headers): Also add 406 as an error which might
482 indicate that the web server doesn't like us doing a HEAD, so GET
483 instead.
484
485 2002-12-01 Hans de Graaff <hans@degraaff.org>
486
487 * Makefile.PL: Updated based on libwww-perl Makefile.PL.
488
489 * checkbot: Remove the preamble cruft and just assume perl will be
490 /usr/bin/perl. Therefore also renamed checkbot.pl -> checkbot.
491 Indicate that Checkbot is licensed under the same terms as Perl
492 itself.
493
494 * checkbot.pl (count_problems): Rewrote debugging code to handle
495 request without header() method; even though this should not be
496 possible, it does happen in the wild.
497 (handle_doc): Perform fully-qualified hostname check for all URIs
498 which support a hostname.
499
500 2002-11-30 Hans de Graaff <hans@degraaff.org>
501
502 * checkbot.pl (add_checked): Use ->can construct to check if URL
503 supports host method.
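
The ->can construct asks an object whether it provides a method before calling it. The sketch below uses two small stand-in classes (Demo::Http and Demo::Mailto, both invented for this example) instead of real URI objects, and host_of is a hypothetical helper name.

```perl
use strict;
use warnings;

# Two tiny stand-in classes; the real code deals with URL objects.
package Demo::Http;
sub new  { my ($class, $host) = @_; return bless { host => $host }, $class }
sub host { return $_[0]{host} }

package Demo::Mailto;
sub new { my ($class, $to) = @_; return bless { to => $to }, $class }

package main;

# Return the host when the object has a host method, undef otherwise.
sub host_of {
    my ($url) = @_;
    return $url->can('host') ? $url->host : undef;
}

my $http = Demo::Http->new('www.example.com');
my $mail = Demo::Mailto->new('someone@example.com');

print host_of($http), "\n";                                  # prints www.example.com
print defined(host_of($mail)) ? "host" : "no host", "\n";    # prints no host
```

Calling $mail->host directly would die with a "Can't locate object method" error, which is what the ->can test avoids.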
504
505 2002-10-27 Hans de Graaff <hans@degraaff.org>
506
507 * checkbot.pl: Add hints for recursive or run-away checkbot
508 processes.
509
510 2002-09-28 Hans de Graaff <hans@degraaff.org>
511
512 * Checkbot 1.68 released
513
514 2002-08-05 Hans de Graaff <hans@degraaff.org>
515
516 * checkbot.pl (handle_doc): Comment out warning about external
517 URLs with non-checkable schemes to avoid lots of useless output.
518
519 2002-06-09 Jostle Lemcke <jostle@users.sourceforge.net>
520
521 * checkbot.pl: Added --allow-simple-hosts option. This option
522 turns off the warnings for unqualified host names.
523
524 2002-04-01 Hans de Graaff <hans@degraaff.org>
525
526 * checkbot.pl (handle_doc): Ignore URLs found in <base>
527 tags. Suggestion from Roman Maeder.
528
529 2002-03-31 Hans de Graaff <hans@degraaff.org>
530
531 * checkbot.pl (print_help): Mention --style option in help message.
532 (check_internal): Always close CURRENT filehandle, and add warn
533 for potential problems with this based on patch and report from
534 Greg Larkin.
535
536 * checkbot.pl: Added HINTS AND TIPS section to
537 documentation. Added hint on using passive FTP based on feedback
538 from Roman Maeder.
539
540 2002-03-31 Brent Verner <brent@rcfile.org>
541
542 * checkbot.pl (handle_doc): Only match http and https, not stuff
543 like httpa.
544
545 2002-03-31 Paco Hope <paco@paco.to>
546
547 * checkbot.css: Contributed style sheet for Checkbot. Use with
548 --style option.
549
550 2002-01-20 Roman Maeder <maeder@mathconsult.ch>
551
552 * checkbot.pl (handle_url): Use select() to sleep instead of
553 sleep() so that sleep interval can be fractional.
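
Perl's four-argument select() with three undefined handle sets simply waits for the timeout, which may be fractional, whereas the built-in sleep() only takes whole seconds. A minimal sketch follows; the helper name nap is hypothetical and Time::HiRes is used only to measure the pause.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Pause for a possibly fractional number of seconds.
sub nap {
    my ($seconds) = @_;
    select(undef, undef, undef, $seconds);
    return;
}

my $start = time;
nap(0.25);
my $elapsed = time - $start;
printf "paused for about %.2f seconds\n", $elapsed;
```

This keeps the delay between requests configurable below one second without requiring an extra module for the sleep itself.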
554
555 2001-12-16 Hans de Graaff <hans@degraaff.org>
556
557 * Checkbot 1.67 released
558
559 2001-11-16 Hans de Graaff <hans@degraaff.org>
560
561 * checkbot.pl: Add example for --match argument based on question
562 by Michael Lambert.
563
564 2001-11-11 Hans de Graaff <hans@degraaff.org>
565
566 * checkbot.pl (count_problems): Quote meta characters in server
567 name and URL when matching them.
568 (handle_doc): Fix two minor bugs related to the move to URI.
569
570 2001-11-11 Evaldas Imbrasas <evaldas@wolfram.com>
571
572 * checkbot.pl: Add --language option to allow language
573 negotiation.
574
575 * checkbot.pl (check_options): Set default for --sleep option to 0.
576
577 * checkbot.pl (check_internal): Only close <CURRENT> if it already
578 exists.
579
580 2001-11-03 Hans de Graaff <hans@degraaff.org>
581
582 * checkbot.pl (printServerProblems): There might not be a response
583 message.
584 (handle_url): Use status_line instead of code and message for
585 HTTP::Response object.
586 (handle_doc): Also check external gopher links.
587
588 2001-10-25 Hans de Graaff <hans@degraaff.org>
589
590 * Checkbot 1.66 released
591
592 * checkbot.pl (get_headers): URI doesn't know about netloc, but it
593 does know about authority.
594 (get_headers): $url is already absolute, no need for ->abs
595
596 2001-10-18 Hans de Graaff <hans@degraaff.org>
597
598 * Checkbot 1.65 released
599
600 2001-10-14 Hans de Graaff <hans@degraaff.org>
601
602 * checkbot.pl (handle_doc): Print a notice when external non
603 HTTP/FTP URLs are dropped.
604
605 2001-09-29 Hans de Graaff <hans@degraaff.org>
606
607 * checkbot.pl (init_modules and other places): Remove
608 URI::URL::strict call and use of new URI::URL because it is
609 obsolete, we should use the URI classes now.
610
611 2001-09-23 Hans de Graaff <hans@degraaff.org>
612
613 * checkbot.pl (init_globals): Initialize last checkpoint time with
614 0 instead of current time, so that we write out a set of pages
615 right at the start. This will catch problems with permissions for
616 these pages as early as possible.
617
618 2001-07-01 Hans de Graaff <hans@degraaff.org>
619
620 * checkbot.pl (get_server_type): Take into account that we might
621 not learn anything about the server
622
623 2001-05-06 Hans de Graaff <hans@degraaff.org>
624
625 * checkbot.pl (get_headers): Factored out of check_external so
626 that moving to using GET requests only will be easier later.
627
628 2001-04-30 Hans de Graaff <hans@degraaff.org>
629
630 * checkbot.pl (send_mail): Really fix printing of starting URLs in
631 email. All URLs are now printed in the subject and body of the
632 message.
633
634 2001-04-15 Hans de Graaff <hans@degraaff.org>
635
636 * Checkbot 1.64 released
637
638 2001-03-13 Hans de Graaff <hans@degraaff.org>
639
640 * checkbot.pl (send_mail): Fix printing of starting URL in email.
641
642 2001-03-04 Nick Hibma <n_hibma@qubesoft.com>
643
644 * checkbot.pl (printServerWarnings): Removed duplicate print statement.
645
646 2001-02-10 Boris Lantrewitz <lantrewi@do.isst.fhg.de>
647
648 * checkbot.pl (init_globals): Allow more environment variables to
649 be used to set the temporary directory.
650 (send_mail): Avoid using printf to the handle for those systems
651 where printf on a pipe is not implemented.
652
653 2001-01-14 Hans de Graaff <hans@degraaff.org>
654
655 * Checkbot 1.63 released
656
657 2001-01-02 Hans de Graaff <hans@degraaff.org>
658
659 * Makefile.PL (chk_version): Require LWP 5.50, which contains an
660 important bugfix when dealing with relative redirects.
661
662 2001-01-01 Hans de Graaff <hans@degraaff.org>
663
664 * checkbot.pl (init_globals): If no --match is given, construct
665 one based on all the start URLs given. Suggested by Mathieu
666 Guillaume.
667
668 2000-12-31 Hans de Graaff <hans@degraaff.org>
669
670 * checkbot.pl (create_page): Remove the .bak file when the new
671 file is written, unless --debug is in effect.
672
673 2000-12-31 OBARA Kiyotake <obara@vc-net.ne.jp>
674
675 * checkbot.pl (print_server): Create correct URLs when --file
676 argument contains directories as well as a filename.
677
678 2000-12-31 David Brownlee <abs@purplei.com>
679
680 * checkbot.pl (create_page): Fix typo in die message.
681
682 2000-12-24 Hans de Graaff <hans@degraaff.org>
683
684 * checkbot.pl: Added a small blurb in the documentation about the
685 URLs Checkbot will find and check.
686
687 2000-12-24 Petter Reinholdtsen <pere@hungry.com>
688
689 * checkbot.pl (handle_url): Deal with redirect responses without
690 Location header.
691
692 2000-11-18 Roman Maeder <maeder@mathconsult.ch>
693
694 * checkbot.pl (handle_url): Remove check which would not check
695 files named the same as the main report file. If you don't want
696 Checkbot to check its intermediate pages, use the --exclude
697 option.
698
699 * checkbot.pl (handle_url): Ask server for file type when
700 requesting http:// URLs to be on the safe side, as using
701 guess_media_type() is not always correct.
702
703 2000-10-28 Nick Hibma <n_hibma@qubesoft.com>
704
705 * checkbot.pl (check_external): Only print when --verbose is true.
706 (printServerProblems): Fix proper printing of <hr>.
707 (handle_doc): Include proper URL for report for unqualified URLs.
708
709 2000-10-01 TAKAKU Masao <masao@ulis.ac.jp>
710
711 * checkbot.pl (print_server): Make pages well-formed by inserting
712 <html> and <body> tags.
713
714 2000-09-24 Hans de Graaff <hans@degraaff.org>
715
716 * Checkbot 1.62 released
717
718 2000-09-16 Hans de Graaff <hans@degraaff.org>
719
720 * checkbot.pl (send_mail): Only mention URL in the subject of the
721 mail if one is given through the --url option.
722 (check_external): The ALEPH web server is also broken with respect
723 to HEAD requests.
724
725 2000-09-04 Hans de Graaff <hans@degraaff.org>
726
727 * checkbot.pl (check_external): JavaWebServer is also broken with
728 respect to HEAD requests.
729
730 2000-08-26 Hans de Graaff <hans@degraaff.org>
731
732 * checkbot.pl (create_page): Add --style option which allows a
733 link to a CSS file to be included in each Checkbot page.
734
735 2000-08-20 Nick Hibma <n_hibma@qubesoft.com>
736
737 * checkbot.pl (check_external): Some servers don't set the Server:
738 header. Check to see if the server field is set in a response to
739 avoid warnings.
740
741 * checkbot.pl (add_checked): Add --enable-virtual option to use
742 hostname instead of IP address to distinguish servers. This allows
743 checking of multiple virtual servers.
744
745 2000-08-13 Hans de Graaff <hans@degraaff.org>
746
747 * Makefile.PL: Add a check for HTML::Parser. Require latest
748 version, 3.10, because I'm not sure older versions work correctly.
749
750 2000-06-29 Hans de Graaff <hans@degraaff.org>
751
752 * Checkbot 1.61 released
753
754 * Makefile.PL (chk_version): Add version checked for in output.
755
756 2000-06-18 Larry Gilbert <larry@n2h2.com>
757
758 * checkbot.pl (check_external): Use GET instead of HEAD for
759 confused closed-source servers.
760
761 2000-06-18 Hans de Graaff <hans@degraaff.org>
762
763 * Makefile.PL (chk_version): require URI 1.07 as it contains bug
764 fixes for using Base URLs.
765
766 * checkbot.pl: Change email and web address
767
768 2000-04-30 Hans de Graaff <graaff@xs4all.nl>
769
770 * Checkbot 1.60 released
771
772 * checkbot.pl (check_options): Add option --dontwarn to exclude
773 certain types of warnings. Based on idea by David Hoekman.
774
775 2000-04-29 Mark Roedel <roedelm@letu.edu>
776
777 * checkbot.pl (handle_url): Deal with "300 Multiple Choices"
778 response which does not offer a URL to redirect to.
779
780 2000-04-09 David Hoekman <dhoekman@halcyon.com>
781
782 * checkbot.pl (init_globals): Allow for TMPDIR with or without
783 trailing /
784
785 2000-04-08 Hans de Graaff <graaff@xs4all.nl>
786
787 * checkbot.pl: Updated contact information in file header.
788
789 2000-03-26 Hans de Graaff <graaff@xs4all.nl>
790
791 * checkbot.pl (check_options): Add message about skipping of
792 external links. Also removes warning about single use of variable.
793
794 2000-03-06 Brian McNett <webmaster@mycoinfo.com>
795
796 * checkbot.pl: On a Mac, ask command line options
797 through MacPerl mechanism.
798
799 2000-02-06 Hans de Graaff <graaff@xs4all.nl>
800
801 * checkbot.pl (init_globals): Check whether URLs on the command
802 line have a proper host. Thanks to Charles Williams for the
803 report.
804
805 2000-01-30 Hans de Graaff <graaff@xs4all.nl>
806
807 * Checkbot 1.59 released
808
809 * checkbot.pl (handle_doc): Use eof instead of parse(undef) to end
810 parsing.
811
812 2000-01-15 Hans de Graaff <graaff@xs4all.nl>
813
814 * checkbot.pl (handle_doc): Show warnings about hostnames only on
815 the console when --verbose.
816
817 2000-01-09 Hans de Graaff <graaff@xs4all.nl>
818
819 * checkbot.pl: Added option --internal-only to skip checking of
820 external links altogether. Idea by David Hoekman
821 <dhoekman@halcyon.com>
822
823 2000-01-02 Hans de Graaff <graaff@xs4all.nl>
824
825 * checkbot.pl (handle_doc): Use canonical URI from LinkExtor,
826 which simplifies the rest of the logic and gets things working
827 with the new version of LinkExtor.
828
829 2000-01-01 Stephane Bortzmeyer <bortzmeyer@pasteur.fr>
830
831 * checkbot.pl (init_globals): Create Checkbot workdir in $TMPDIR
832 if defined, /tmp otherwise.
833
834 1999-12-31 Hans de Graaff <graaff@xs4all.nl>
835
836 * checkbot.pl (handle_doc): Change frag to fragment.
837
838 1999-11-07 Hans de Graaff <graaff@xs4all.nl>
839
840 * checkbot.pl (handle_doc): Add warning for URLs for which LWP
841 can't determine a hostname, and don't check them further.
842
843 1999-10-24 Hans de Graaff <graaff@xs4all.nl>
844
845 * checkbot.pl (print_help): Added line on --interval option.
846
847 1999-10-23 Hans de Graaff <graaff@xs4all.nl>
848
849 * checkbot.pl (init_globals): Fixed proper determination of server
850 prefix if a filename is supplied, thanks to Michael Baumer.
851
852 1999-10-02 Hans de Graaff <graaff@xs4all.nl>
853
854 * checkbot.pl (init_modules): Added use URI.
855
856 1999-08-21 Hans de Graaff <graaff@xs4all.nl>
857
858 * Makefile.PL (chk_version): Added check for URI.
859
860 1999-07-17 Hans de Graaff <graaff@xs4all.nl>
861
862 * README: Added blurb on the announcements mailing list.
863
864 1999-07-06 Hans de Graaff <graaff@xs4all.nl>
865
866 * checkbot.pl (add_checked): Deal with the fact that a mailto: URL
867 has no host component. Thanks to John Croft for the report.
868
869 1999-06-27 Hans de Graaff <graaff@xs4all.nl>
870
871 * checkbot.pl (handle_url): Really fix relative redirection URLs
872 using the URI class. Thanks to Thomas Zander for the report and
873 reproducible failing URL.
874
875 1999-05-03 Hans de Graaff <graaff@xs4all.nl>
876
877 * checkbot.pl (printServerWarnings): Also change clustering of URLs.
878
879 1999-05-02 Hans de Graaff <graaff@xs4all.nl>
880
881 * checkbot.pl (signature): Add quotes around the URL in the
882 signature.
883 (printServerProblems): Fixed clustering of URLs so that faulty
884 links are listed under the URL that contains them, instead of the
885 other way around. This ordering problem was introduced in 1.53.
886
887 1999-04-10 Hans de Graaff <graaff@xs4all.nl>
888
889 * checkbot.pl (handle_url): Make sure a redirected URL is fully
890 qualified (based on the original URL) to avoid dying on it
891 later. Thanks to David Hoekman for the initial analysis.
892
893 1999-04-05 Hans de Graaff <graaff@xs4all.nl>
894
895 * checkbot.pl (printAllServers): Taken out of create_page for
896 clarity.
897 (printServerWarnings): Keep warning headers from being printed for
898 each warning.
899
900 1999-03-15 Hans de Graaff <graaff@xs4all.nl>
901
902 * README: Explain which Perl modules are needed.
903
904 1999-02-20 Hans de Graaff <graaff@xs4all.nl>
905
906 * checkbot.pl (printServerWarnings): Fix printing of warnings so
907 that headers are only printed once.
908 (print_server): get correct IP address for web servers with
909 non-standard port numbers.
910
911 1999-02-08 Hans de Graaff <graaff@xs4all.nl>
912
913 * Makefile.PL (chk_version): Added location of Mail::Send.
914
915 1999-01-18 Hans de Graaff <graaff@xs4all.nl>
916
917 * checkbot.pl (count_problems): Change counting of problems to
918 deal with new structure.
919
920 1999-01-17 Hans de Graaff <graaff@xs4all.nl>
921
922 * checkbot.pl (printServerProblems): Changed to accommodate new
923 inventory of problem responses. This new method allows multiple bad
924 links to one URL to be reported all at once. Also use
925 standardized response descriptions based on a patch by Benjamin
926 Franz <snowhare@nihongo.org>.
927
928 1999-01-10 Hans de Graaff <graaff@xs4all.nl>
929
930 * checkbot.pl (byReferringPage): Added to allow sorting of
931 problems by referer.
932 (byProblem): Removed code to compare by exact message and
933 referer.
934 Removed the pre-amble to generate correct perl path because it is
935 a bit too cumbersome during development.
936
937 1998-12-31 Hans de Graaff <graaff@xs4all.nl>
938
939 * checkbot.pl (handle_url): Do a HEAD request when the guessed
940 content-type matches application/octet-stream to get the real
941 content-type from the server.
942
943 1998-12-27 Hans de Graaff <graaff@xs4all.nl>
944
945 * checkbot.pl (handle_doc): Added warning for HTTP URLs without a
946 fully-qualified hostname.
947
948 * checkbot.pl (printServerWarnings): Added a mechanism to also
949 display checkbot warnings, unrelated to the HTTP responses, on the
950 results pages.
951
952 1998-10-24 Hans de Graaff <graaff@xs4all.nl>
953
954 * checkbot.pl (setup): Explicitly set record separator $/
955 This appears needed for perl 5.005, and fixes a problem
956 where no URLs would appear to match except the first few.
957
958 1998-10-10 Hans de Graaff <graaff@xs4all.nl>
959
960 * checkbot.pl: Made POD conform to new scripts format better.
961
962 1998-06-21 Hans de Graaff <graaff@xs4all.nl>
963
964 * checkbot.pl (init_modules): HTML::Parse is no longer needed,
965 removed.
966
967 Sat Sep 6 16:00:12 1997 Hans de Graaff <graaff@xs4all.nl>
968
969 * checkbot 1.51 released
970
971 Sat Aug 30 18:05:39 1997 Hans de Graaff <graaff@xs4all.nl>
972
973 * checkbot.pl (init_globals): assume file: scheme when no scheme
974 is present.
975
976 * checkbot.pl: Small portability stuff for perl 5.004 and LWP 5.11.
977
978 Sun Aug 17 08:56:38 1997 Hans de Graaff <graaff@xs4all.nl>
979
980 * README: Changed email addresses to point to new ISP.
981
982 Mon Apr 28 09:08:29 1997 Hans de Graaff <graaff@xs4all.nl>
983
984 * checkbot.pl: Parsing VERSION is somewhat tricky. Fixed.
985
986 Sun Apr 27 21:02:58 1997 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
987
988 * checkbot.pl (check_external): Close EXTERNAL after use.
989
990 Sun Apr 20 10:24:09 1997 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
991
992 * checkbot.pl: Fixed a number of small bugs reported by Jost Krieger.
993 Regular expressions can now be used with the options.
994 Added --interval option to denote maximum interval between updates.
995
996 Sat Apr 5 17:03:46 1997 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
997
998 * checkbot.pl (init_globals): Added checks for URLs without a scheme.
999
1000 Fri Mar 14 11:17:21 1997 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1001
1002 * checkbot.pl (print_help): Fix typo.
1003
1004 Tue Jan 14 16:51:36 1997 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1005
1006 * checkbot.pl (check_internal): Check whether there are really
1007 entries in the new queue when changing queues.
1008
1009 Sat Jan 4 14:26:04 1997 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1010
1011 * checkbot.pl (print_help): --seconds should be --sleep in help.
1012
1013 Mon Dec 30 12:03:14 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1014
1015 * checkbot.pl (handle_url): If a URL is exclude'd, only use HEAD
1016 on it, not GET.
1017 Starting URLs can now be entered on the command line in addition
1018 to the --url option. --url takes precedence. --match is
1019 initialized with first URL if not given as separate option.
1020
1021 Mon Dec 23 20:21:32 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1022
1023 * checkbot.pl (print_server_problems): Each error message was
1024 evaluated as a regexp, potentially crashing checkbot on a bad
1025 regexp (e.g. including the string '++').
1026
1027 Mon Dec 23 15:15:05 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1028
1029 * checkbot.pl (ip_address): Deal with IP-address not found.
1030
1031 Sun Dec 8 12:55:33 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1032
1033 * checkbot.pl (send_mail): --note didn't work; Checkbot would
1034 crash when no external links were found.
1035
1036 Wed Dec 4 12:43:14 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1037
1038 * checkbot.pl (add_checked): All checked URLs are indexed using
1039 IP-address to avoid checking pages multiple times for multiple
1040 CNAMEs.
1041
1042 Mon Nov 4 14:19:30 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1043
1044 * checkbot.pl (send_mail): Braino in URL fixed.
1045
1046 Sun Oct 27 20:16:38 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1047
1048 * checkbot.pl (init_globals): Don't let --match default to the
1049 --url until after we possibly change the URL (this happens for
1050 file:/ URLs, currently).
1051
1052 Wed Oct 23 14:22:15 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1053
1054 * checkbot.pl (check_point): Oops, checking would occur every minute
1055
1056 Mon Oct 21 13:41:48 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1057
1058 * checkbot.pl (print_help): Added version number to help info.
1059
1060 Wed Oct 16 21:05:58 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1061
1062 * checkbot.pl: Added --proxy option for checking external links
1063 through a proxy server
1064
1065 Sat Sep 28 09:26:48 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1066
1067 * checkbot.pl (init_globals): Changed /var/tmp to /tmp.
1068 (check_point): Slower exponential rate, upper limit of 3 hours
1069
1070 * Makefile.PL: Added check for Mail::Send
1071
1072 * README: Added
1073
1074 Thu Sep 26 17:01:36 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1075
1076 * checkbot.pl: Switched from short options to long options.
1077 I was already running out of meaningful options, so before adding
1078 additional stuff I wanted to move to Long options first. You
1079 should be able to abbreviate most options to the previous values.
1080 Notable exception is -m, which has become --match.
1081
1082 Wed Sep 25 10:58:06 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1083
1084 * checkbot.pl:
1085 Renamed from checkbot
1086 Added preamble to set proper path for perl (code from Gisle Aas)
1087
1088 * Makefile.PL: First version, installs checkbot and checkbot.1
1089
1090 * checkbot: Changed $revision to $VERSION for MakeMaker.
1091
1092 Thu Sep 12 15:09:07 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1093
1094 * index.html: updated required modules and location.
1095
1096 * checkbot: require LWP-5.02, because it fixes a few nasty bugs.
1097
1098 Thu Sep 5 16:00:42 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1099
1100 * index.html:
1101 Removed old and out-of-date documentation. Replaced by link to
1102 automatically generated html version of POD documentation
1103 within Checkbot.
1104
1105 * checkbot:
1106 Fixed documentation bugs.
1107 Really fix the case insensitive comparison.
1108
1109 Sun Sep 1 20:31:46 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1110
1111 * checkbot (print_server_problems):
1112 Make comparison for error message case insensitive.
1113
1114 Fri Aug 30 20:19:56 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1115
1116 * checkbot: Fixed several typos.
1117
1118 Wed Aug 7 10:06:29 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1119
1120 * checkbot (handle_doc):
1121 The new LinkExtractor is nice, but I shouldn't treat its output as
1122 a hash when it is an array, which skipped every other link.
1123
1124 Mon Aug 5 08:46:24 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1125
1126 * checkbot (print_server):
1127 Fixed silly bug in calculating the percentage of problems on each
1128 server.
1129
1130 Fri Aug 2 21:38:39 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1131
1132 * checkbot: Added several patches by Bruce Speyer:
1133 Added -N note option to go along with -M, -z to suppress reporting
1134 errors on matching links.
1135 Added enough logic to catch gopher URLs if no gopher server found.
1136 Need further logic to parse gopher returned menu for bad file or
1137 directory.
1138
1139 * checkbot: Made a good start with POD documentation inside the
1140 checkbot file. Try 'perldoc checkbot'.
1141
1142 * TODO: Added a number of suggestions by Luuk de Boer.
1143
1144 * checkbot (send_mail): Include summary of links checked in message.
1145
1146 Fri Aug 2 13:01:02 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1147
1148 * checkbot:
1149 Added check for correct LWP version. We now need 5.01, due to bugs
1150 in the handling of the BASE attribute in previous versions.
1151
1152 Sat Jul 27 21:13:26 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1153
1154 * checkbot:
1155 Added several patches by Bruce Speyer:
1156 Optimized some static regular expressions.
1157 Fixed not setting the timeout, making the -t option useless.
1158
1159 Mon Jul 22 22:28:34 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1160
1161 * checkbot (create_page):
1162 Fixed number of columns in summary output.
1163
1164 Sat Jul 20 11:49:23 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1165
1166 * checkbot (handle_doc): Changed to use the new HTML::LinkExtor,
1167 which will be present in LWP5.01. Should be more efficient, and
1168 less prone to memory leaks.
1169
1170 Sat Jul 13 12:41:23 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1171
1172 * checkbot (create_page): Forgot to add the ratio on the page.
1173 (check_external): Fix problems with different `wc` output.
1174
1175 Sat Jun 22 11:30:12 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1176
1177 * checkbot: Use correct base URL as returned with the document.
1178 Only check document when we used 'GET' to receive it.
1179 Remove magic guessing with ending slash of starting url.
1180 Deal with redirections by inserting redirected URLs into queue
1181 again.
1182
1183 Thu Jun 20 15:58:20 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1184
1185 * checkbot: Major cleanup of initialization code. Also added todo
1186 counts to progression page, and proper todo handling for external
1187 links.
1188
1189 Sun Jun 16 21:16:28 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1190
1191 * checkbot: Added -M option: send mail when Checkbot is done.
1192 Fixed division by zero bug when external links == 0
1193
1194 Tue Jun 4 12:46:39 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1195
1196 * checkbot: Better way to ignore fragments.
1197
1198 Sat Jun 1 15:14:52 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1199
1200 * checkbot: Don't print decimals with the percentages.
1201 Major update of counting, and printing counts. Cleaned up
1202 variables, corrected counting, made display more consistent and
1203 clear.
1204
1205 Wed May 29 21:18:26 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1206
1207 * checkbot: Small fixes to support lwp-win32 as well, thanks to
1208 Martin Cleaver.
1209
1210 Mon May 27 09:21:30 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1211
1212 * checkbot: oops, small error in regexp caused script to append a
1213 slash to almost all start-url's. Fixed.
1214
1215 * checkbot (handle_doc): External links without full URL's were
1216 not always handled properly.
1217
1218 Sun May 26 10:04:39 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1219
1220 * checkbot: If the starting URL doesn't end in a slash, and
1221 doesn't have an extension, assume we need to add a slash.
1222
1223 * index.html: Add version number to web page, and make sure it gets
1224 updated automatically.
1225
1226 Wed May 22 09:58:36 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1227
1228 * checkbot: Changed verbose output of links found on pages.
1229
1230 Tue May 14 16:43:38 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1231
1232 * TODO: updated with respect to recent changes.
1233
1234 Mon May 13 15:06:05 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1235
1236 * checkbot: Added LWP version number to agent field, changed page
1237 update policy, and updated script to LWP5b13.
1238
1239 Sat May 4 21:38:56 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1240
1241 * checkbot: Changed checked array to an associative array. Will
1242 consume more memory, but drastically cut back on lookup time.
1243
1244 Rewrote handle_url logic to be more clear. Also fixed bug where
1245 servers would be added to the list unjustly.
1246
1247 Sleep was only done on problem links, not after each request.
1248
1249 Also added checks for already checked links while scanning through
1250 the document, and only add those links not checked to the queue.
1251
1252 Add percentage problem links for each individual server.
1253
1254 Mon Apr 29 08:43:12 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1255
1256 * checkbot: Deal with unknown or non-determinable server types.
1257
1258 Only add links to the external queue when we know we can check
1259 their protocol.
1260
1261 Additional changes to layout and content of pages.
1262
1263 Sun Apr 28 21:16:51 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1264
1265 * checkbot: Rewrote report page.
1266
1267 Wed Apr 24 22:39:43 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1268
1269 * checkbot: Added a number of patches by Tim MacKenzie
1270 Added -s option to set the seconds of sleep between requests.
1271 Remove work files when *not* debugging.
1272 Only compile -m and -x regular expressions once.
1273 Also check external ftp and nntp links (using HEAD only).
1274 Get rid of huge memory leak! (Also noted by Fabrice Gaillard)
1275
1276 Fri Mar 29 10:58:24 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1277
1278 * checkbot:
1279 Got rid of warnings about some variables.
1280 Fixed problem with incorrect automatic -m argument when scanning
1281 local files.
1282
1283 Sun Mar 24 18:01:05 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1284
1285 * checkbot:
1286 Added code to support regular expressions with the -m and -x
1287 arguments. Thanks to Thomas Thiel for the patch and suggestions.
1288
1289 No strict checking on schemes, fixes problem with unknown schemes
1290 stopping checkbot. Thanks to Pierre-Yves Foucou.
1291
1292 * checkbot:
1293 Should create directory for temporary files, and remove it
1294 afterwards. Noted by Steve Fisk.
1295
1296 Sat Mar 16 13:40:48 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1297
1298 * checkbot:
1299 Made a number of changes from or based on patches by Thomas Thiel:
1300
1301 Added missing t option in Getopts string.
1302
1303 Made -m argument optional. If not given, the -u argument is also
1304 used as the start argument.
1305
1306 Temporary files are now created in a separate directory. Its name
1307 contains the PID of Checkbot, to allow several concurrent
1308 Checkbots being run. Also remove temporary files, unless
1309 debugging.
1310
1311 Implement file:// scheme to allow direct checking (without HTTP
1312 server)
1313
1314 Fri Mar 15 11:06:13 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1315
1316 * checkbot:
1317 Fixed warnings (and in the process, a small bug as well).
1318 Added URL and proper name to help.
1319
1320 Sat Mar 2 11:51:45 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1321
1322 * checkbot:
1323 Added 'require 5.002' (because libwww-perl5b8 requires it).
1324 Added 'use strict', and fixed problems resulting from this. This
1325 can be seen as a first step towards fixing the huge
1326 memory-consumption.
1327 Updated help.
1328
1329 Tue Feb 27 09:57:57 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1330
1331 * checkbot:
1332 Fixed bug which occurred when -x option was not present.
1333 Updated script to use libwww-perl5b8 function names. This is not
1334 backward compatible with versions prior to beta 8.
1335
1336 Mon Feb 26 12:46:08 1996 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1337
1338 * checkbot:
1339 Fixed bug with Referer header for external URL's.
1340 Also make server pages auto-refresh.
1341
1342 Sat Feb 24 11:48:15 1996 Hans de Graaff <Hans.deGraaff@twi72.twi.tudelft.nl>
1343
1344 * TODO: New file.
1345
1346 * checkbot: Added single -x option as an additional exclude pattern.
1347 This overrules the -m match attribute.
1348
1349 Mon Dec 11 14:13:30 1995 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1350
1351 * index.html:
1352 Added libwww-perl5 address, and added a usage section.
1353
1354 * checkbot.pl:
1355 Removed this old perl4 version.
1356
1357 Fri Dec 8 13:41:43 1995 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1358
1359 * checkbot:
1360 Major rewrite of most of the internal routines. The routines are
1361 much more structured now, and broken up into smaller routines.
1362 I also changed the way checked links are remembered. It should be
1363 much less efficient, CPU-wise, but more efficient memory-wise.
1364
1365 Fri Nov 24 16:45:18 1995 Hans de Graaff <J.J.deGraaff@twi.tudelft.nl>
1366
1367 * checkbot:
1368 Fixed small problems, mostly with output.
1369 Fixed checking of external links
1370 Changed sorting order
1371
1372 * checkbot:
1373 Perl5 version now works for the most part. Although Checkbot isn't
1374 fully finished I at least feel confident to release it.
1375
1376 Fri Aug 25 11:23:36 1995 Hans de Graaff <graaff@is.twi.tudelft.nl>
1377
1378 * Made a start with the perl5 version of checkbot. The modules in
1379 perl5 (e.g. LWP) look very promising, and should make checkbot
1380 quite a bit better.