htmlrecode  1.3.1
About: htmlrecode recodes an HTML file using a new character set.
  Fossies Dox: htmlrecode-1.3.1.tar.gz  ("inofficial" and yet experimental doxygen-generated source code documentation)  

progdesc.php
1 <?php
2 //TITLE=HTML file character encoding changer
3 
4 $outset= 'utf-8';
5 $title = 'HTML file Character set converter';
6 $progname = 'htmlrecode';
7 
8 $text = array(
9  '1. Purpose' => "
10 
11 Recodes an HTML file into a new character set without losing
12 any characters. You can recode shift_jis to euc-jp, utf8 to latin1,
13 iso-8859-15 to GB18030, iso-2022-jp to koi8-r, etc., and none
14 of the characters on the page will become unreadable
15 (unless you specify the -l switch, which disables the generation of &amp;#nnnn; escapes).
16 <p>
17 Standards-compliant HTML is a good thing.
18 One of the goals in the development of this program is
19 that it never makes the HTML more broken than it previously was.
20 It should even make it better than it was. So if you
21 see that the program does the opposite, please <a href=\"#contact\">tell me</a>.
22 
23 ", '1. Usage' => "
24 
25 <pre class=smallerpre>htmlrecode 1.2.0 - Copyright (C) 1992,2003 Bisqwit (http://iki.fi/bisqwit/)
26 
27 Usage: htmlrecode [&lt;option> [&lt;...>]]
28 
29 Reads stdin, writes stdout.
30 
31 Options:
32  -I, --inset setname Assumed input character set (default: iso-8859-1)
33  -O, --outset setname Wanted output character set (default: iso-8859-1)
34  -V, --version Displays version information.
35  -e, --usehex Use hexadecimal escapes.
36  -g, --signature Prefix the file with an unicode signature.
37  -h, --help This help.
38  -l, --lossy Disable lossless conversion.
39  -q, --quiet Be less verbose.
40  -s, --strict Turn off support for slightly broken HTML.
41  -v, --verbose Be less quiet.
42  -x, --xmlmode XML mode: all tag param values quoted.
43 
44 Pipe in the html file and pipe the output to result file.</pre>
45 
46 ", '1. TODO' => "
47 
48 I'll soon add an interface for modifying the text content of an HTML file.<br>
49 This should make it easier to write filters like Pootpoot or Pikachifier. It is
50 already theoretically supported, but I haven't invented an interface for
51 it yet.
52 
53 ", '1. Installation' => "
54 
55 <pre class=smallerpre
56 >\$ make
57 \$ su
58 # make install</pre>
59 
60 If you do not want to install
61 <a href=\"http://oktober.stc.cx/source/libargh.html\">libargh</a>
62 (included in the archive), do not run \"make install\"; instead, edit the
63 Makefile to enable STATIC linking instead of DYNAMIC.
64 
65 ", '1. Example' => "
66 
67 This page template is locally stored in iso-8859-1, but is
68 automatically converted to utf-8 to make the final version.<p>
69 Here are some latin letters: åäöñé<br>
70 Here are some CJK (chinese/japanese/korean ideograms): &#26085;&#26412;<br>
71 Here are some html escapes: &gt;&quot;&auml;&ouml;&ecirc;<br>
72 <p>
73 Source code of the above:<pre class=smallerpre
74 >Here are some latin letters: &aring;&auml;&ouml;&ntilde;&eacute;&lt;br&gt;
75 Here are some CJK (chinese/japanese/korean ideograms): &amp;#26085;&amp;#26412;&lt;br&gt;
76 Here are some html escapes: &amp;gt;&amp;quot;&amp;auml;&amp;ouml;&amp;ecirc;&lt;br&gt;
77 </pre>
78 What your browser receives is not &amp;#26085; etc. but the actual utf-8 characters.
79 
80 ", 'contact:1. Feedback' => "
81 
82 If you have problems using this program or ideas on how to
83 improve it, email me your questions or ideas.<br>
84 Please do not omit the details.<br>
85 My email address (sigh) is: <em>bisqwit a<b style=\"font-weight:lighter\">t i</b>ki <small>dot</small> fi</em>
86 
87 ", '1. Requirements' => "
88 
89 htmlrecode is written in C++ and uses the Standard Template Library.<br>
90 GNU make is required.<br>
91 I have g++ version 3.3, and htmlrecode compiles without warnings. For now.
92 
93 ", '1.1. Compilation problems' => "
94 
95 htmlrecode uses wide strings, a feature that different g++ versions
96 handle very inconsistently. <code>htmlrecode.hh</code> has some settings you can
97 choose between. Try this:<p>
98 Replace<pre>
99  //#define wstring ucs4string
100  typedef wchar_t ucs4;
101  //typedef unsigned int ucs4;
102  //typedef basic_string&lt;ucs4> wstring;</pre>
103 With<pre>
104  //#define wstring ucs4string
105  //typedef wchar_t ucs4;
106  typedef unsigned int ucs4;
107  typedef basic_string&lt;ucs4> wstring;</pre>
108 </p>
109 This might help when compiling with g++-2.95.
110 
111 ", '1. Changelog' => "
112 
113 <pre>
114 Since 1.3.0:
115  - Compilation fixes on more up-to-date compilers. (Thanks Santiago M. Mola)
116 
117 Since 1.2.0:
118  - Abruptly terminated multibyte sequences no longer
119  cause htmlrecode to enter an infinite loop
120 
121 Since 1.1.5:
122  - Tags are now recognized in all mixed case
123  - Tag values can be in '', not only in \"\"
124  - -:_. are recognized to be part of tag value if no \"\" is there
125  - Non-space characters are also recognized as above :( (unless the -s option was used)
126  - SCRIPT and STYLE contents are \"raw\" until the next &lt;/, unless -s was used
127  - SCRIPT/STYLE contents are properly rehidden if necessary
128  - \" and ' quotes (and no quotes) are used wisely
129  - Warnings from some bad HTML
130  - Indentations inside tags are now kept mostly intact
131  - XHTML support
132  - Unicode signature character support
133  - Major structural rewrites
134  - New \"configure\" script
135 
136  - Big thanks to Winfried Szukalski for
137  his thorough testing efforts and comments.
138 
139 Since 1.1.4:
140  - workaround for g++ versions, now compiles with g++-3
141 
142 Since 1.1.3:
143  - optimizations
144  - error resistance
145 
146 Since 1.1.2:
147  - hex support
148  - g++ string workarounds
149 
150 Since 1.1.1:
151  - improved documentation
152  - fixed &lt; (was output as &amp;gt;, should be &amp;lt;)
153 
154 </pre>
155 
156 ", '1. Copying' => "
157 
158 htmlrecode has been written by Joel Yliluoma, a.k.a.
159 <a href=\"http://iki.fi/bisqwit/\">Bisqwit</a>,<br>
160 and is distributed under the terms of the
161 <a href=\"http://www.gnu.org/licenses/licenses.html#GPL\">General Public License</a> (GPL).
162 
163 ");
164 include '/WWW/progdesc.php';
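
The Purpose section says that no characters are lost because anything the output character set cannot hold is written as an &#nnnn; escape, and the -e option switches those escapes to hexadecimal. The following is a minimal sketch of that idea for an iso-8859-1 target only. The function name `to_latin1_lossless` and its `use_hex` parameter are hypothetical and are not taken from the htmlrecode sources, which may work quite differently.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not htmlrecode's own code: encode UCS-4 text as
// iso-8859-1, replacing every code point that does not fit with a numeric
// character reference so that no character is lost.
std::string to_latin1_lossless(const std::u32string& text, bool use_hex = false)
{
    std::string out;
    for (char32_t c : text)
    {
        if (c <= 0xFF)
        {
            // Representable in iso-8859-1: emit the single byte directly.
            out += static_cast<char>(static_cast<unsigned char>(c));
        }
        else
        {
            // Not representable: emit &#nnnn; (or &#xhhhh; when use_hex is set).
            char buf[16];
            std::snprintf(buf, sizeof buf, use_hex ? "&#x%lX;" : "&#%lu;",
                          static_cast<unsigned long>(c));
            out += buf;
        }
    }
    return out;
}
```

Calling `to_latin1_lossless(U"\u65E5\u672C")` would yield the same `&#26085;&#26412;` text that appears in the Example section. A lossy conversion, as selected by -l, would have to drop or substitute those characters instead.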
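
The changelog entry for 1.2.0 mentions that abruptly terminated multibyte sequences used to cause an infinite loop. One common way to avoid that class of bug is to make the decoder consume at least one input byte on every iteration, even when the sequence is truncated or malformed. The sketch below illustrates this for UTF-8 input. `decode_utf8` is a hypothetical function written for this illustration, it is not htmlrecode's decoder, and it does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch, not htmlrecode's actual decoder: convert UTF-8 bytes
// to UCS-4, substituting U+FFFD for truncated or malformed sequences.
std::u32string decode_utf8(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        unsigned char lead = static_cast<unsigned char>(in[i]);

        // Sequence length implied by the lead byte (0 means invalid lead).
        std::size_t len = lead < 0x80         ? 1
                        : (lead >> 5) == 0x06 ? 2
                        : (lead >> 4) == 0x0E ? 3
                        : (lead >> 3) == 0x1E ? 4 : 0;

        // Payload bits carried by the lead byte.
        char32_t cp = len == 1 ? lead
                    : len == 2 ? (lead & 0x1F)
                    : len == 3 ? (lead & 0x0F)
                    :            (lead & 0x07);

        // A sequence that runs past the end of the input is not valid.
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k)
        {
            unsigned char cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                ok = false;                      // not a continuation byte
            else
                cp = (cp << 6) | (cont & 0x3F);  // append six payload bits
        }

        if (ok)
        {
            out += cp;
            i += len;
        }
        else
        {
            out += U'\uFFFD';
            i += 1;  // always advance, so the loop cannot stall
        }
    }
    return out;
}
```

The key design choice is the final `else` branch: whatever went wrong, the index moves forward by one byte, so input ending in the middle of a multibyte character terminates the loop instead of being retried forever.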