"Fossies" - the Fresh Open Source Software Archive

Member "libraries/idna_convert/ReadMe.txt" (12 Sep 2021, 7656 Bytes) of package /linux/www/Joomla_3.10.2-Stable-Full_Package.tar.bz2:


*******************************************************************************
*                                                                             *
*                    IDNA Convert (idna_convert.class.php)                    *
*                                                                             *
* http://idnaconv.phlymail.de                     mailto:phlymail@phlylabs.de *
*******************************************************************************
* (c) 2004-2011 phlyLabs, Berlin                                              *
* This file is encoded in UTF-8                                               *
*******************************************************************************

Introduction
------------

The class idna_convert converts internationalized domain names (see RFC 3490,
3491, 3492 and 3454 for details), as used by various registries worldwide,
between their original (localized) form and the encoded form used in the DNS
(Domain Name System).

The class provides two main public methods, encode() and decode(), which do
exactly what you would expect them to do. They accept complete domain names,
simple strings and complete email addresses. That means you can use any of the
following notations:

- www.nörgler.com
- xn--nrgler-wxa
- xn--brse-5qa.xn--knrz-1ra.info

Errors, such as incorrectly encoded or invalid strings, lead either to a FALSE
return value (in strict mode) or to only partially converted strings.
You can query the error that occurred by calling the method get_last_error().
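
The error handling described above can be sketched as follows. The helper
encode_or_report() is a hypothetical name introduced for illustration and is
not part of idna_convert; it relies only on encode() and get_last_error() as
named in this ReadMe. Note that it only detects the FALSE case. A partially
converted string, as may be returned outside strict mode, passes through
unchanged.

```php
<?php
// Hypothetical helper (not part of idna_convert): encode a name and report
// a failure instead of silently passing FALSE on.
// $idn is expected to be an idna_convert instance.
function encode_or_report($idn, $name)
{
    $ace = $idn->encode($name);
    if ($ace === false) {
        // encode() signalled an error; fetch the message for the caller
        return array(false, $idn->get_last_error());
    }
    // Success, or a partially converted string in non-strict mode
    return array($ace, null);
}
```

After including idna_convert.class.php, a call might look like
list($ace, $error) = encode_or_report(new idna_convert(), $input);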

Unicode strings are expected to be either UTF-8 strings, UCS-4 strings or UCS-4
arrays. The default format is UTF-8. To set a different encoding, call the
method set_parameter() (see the inline documentation for details).
ACE strings (the Punycode form) are always 7-bit ASCII strings.
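
Since ACE strings are plain 7-bit ASCII and, as the notations listed earlier
show, carry the prefix xn--, the two forms are easy to tell apart before
deciding whether to call encode() or decode(). A minimal sketch; the function
name looks_like_ace() is hypothetical and not part of the class:

```php
<?php
// Hypothetical helper (not part of idna_convert): returns true if a single
// domain label looks like an ACE (Punycode) label.
function looks_like_ace($label)
{
    // ACE labels consist of 7-bit ASCII characters only ...
    $is_ascii = (preg_match('/^[\x00-\x7F]+$/', $label) === 1);
    // ... and start with the prefix "xn--" (compared case-insensitively)
    $has_prefix = (strncasecmp($label, 'xn--', 4) === 0);
    return $is_ascii && $has_prefix;
}

var_dump(looks_like_ace('xn--nrgler-wxa')); // bool(true)
var_dump(looks_like_ace('nörgler'));        // bool(false)
```

The check works per label, so a full domain name would have to be split at the
dots first.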

ATTENTION: As of version 0.6.0 this class is written in the OOP style of PHP5.
Since PHP4 is no longer actively maintained, you should switch to PHP5 as soon
as possible.
We do not expect any compatibility issues with the upcoming PHP6 either.

ATTENTION: BC break! As of version 0.6.4 the class by default allows the German
ligature ß to be encoded, because DeNIC, the registry for .DE, allows domains
containing ß.
In older builds "ß" was mapped to "ss". Should you still need this behaviour,
see example 5 below.

ATTENTION: As of version 0.8.0 the class fully supports IDNA 2008. Thus the
parameter that previously controlled the handling of ß is deprecated and has
been replaced by a parameter to switch between the standards. See the updated
example 5 below.

Files
-----
idna_convert.class.php         - The actual class
example.php                    - An example web page for converting
transcode_wrapper.php          - Convert various encodings, see below
uctc.php                       - phlyLabs' Unicode Transcoder, see below
ReadMe.txt                     - This file
LICENCE                        - The LGPL licence file

The class is contained in idna_convert.class.php.
   63 
   64 Examples
   65 --------
   66 1. Say we wish to encode the domain name nörgler.com:
   67 
   68 // Include the class
   69 require_once('idna_convert.class.php');
   70 // Instantiate it
   71 $IDN = new idna_convert();
   72 // The input string, if input is not UTF-8 or UCS-4, it must be converted before
   73 $input = utf8_encode('nörgler.com');
   74 // Encode it to its punycode presentation
   75 $output = $IDN->encode($input);
   76 // Output, what we got now
   77 echo $output; // This will read: xn--nrgler-wxa.com


2. We received an email from a punycoded domain and want to learn how the
   domain name reads originally:

// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// The input string
$input = 'andre@xn--brse-5qa.xn--knrz-1ra.info';
// Decode it to its Unicode representation (UTF-8 by default)
$output = $IDN->decode($input);
// Output what we got; if the output should be in a format other than UTF-8
// or UCS-4, you have to convert it before outputting it
echo utf8_decode($output); // This will read: andre@börse.knörz.info


3. The input is read from a UCS-4 coded file and encoded line by line. By
   appending the optional second parameter we tell encode() about the input
   format to be used:

// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// Iterate through the input file line by line
foreach (file('ucs4-domains.txt') as $line) {
    echo $IDN->encode(trim($line), 'ucs4_string');
    echo "\n";
}


4. We wish to convert a whole URI into the IDNA form, but leave the path or
   query string component of it alone. Just using encode() would lead to mangled
   paths or query strings. Here the public method encode_uri() comes into play:

// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert();
// The input string, a whole URI in UTF-8 (!)
$input = 'http://nörgler:secret@nörgler.com/my_päth_is_not_ÄSCII/';
// Encode it to its Punycode representation
$output = $IDN->encode_uri($input);
// Output what we got
echo $output; // http://nörgler:secret@xn--nrgler-wxa.com/my_päth_is_not_ÄSCII/


5. To support IDNA 2008, the class needs to be instantiated with an additional
   parameter. The parameter can also be changed on an existing instance.

// Include the class
require_once('idna_convert.class.php');
// Instantiate it
$IDN = new idna_convert(array('idn_version' => 2008));
// Something containing the German letter ß
$input = 'meine-straße.de';
// Encode it to its Punycode representation; the input is a bare domain name,
// not a whole URI, so encode() is used here rather than encode_uri()
$output = $IDN->encode($input);
// Output what we got
echo $output; // xn--meine-strae-46a.de
// Switch back to the old IDNA 2003, the original standard
$IDN->set_parameter('idn_version', 2003);
// Something containing the German letter ß
$input = 'meine-straße.de';
// Encode it to its Punycode representation
$output = $IDN->encode($input);
// Output what we got
echo $output; // meine-strasse.de


Transcode wrapper
-----------------
If your strings are in an encoding other than ISO-8859-1 or UTF-8, you need to
translate them to UTF-8 before feeding them to the IDNA converter.
PHP's built-in functions utf8_encode() and utf8_decode() can only deal with
ISO-8859-1.
Use the file transcode_wrapper.php for the conversion. It requires either iconv,
libiconv or mbstring installed together with one of the relevant PHP extensions.
The functions you will find useful are
encode_utf8() as a replacement for utf8_encode() and
decode_utf8() as a replacement for utf8_decode().

Example usage:
<?php
require_once('idna_convert.class.php');
require_once('transcode_wrapper.php');
// Instantiate the converter
$IDN = new idna_convert();
$mystring = '<something in e.g. ISO-8859-15>';
// Translate the string to UTF-8 first
$mystring = encode_utf8($mystring, 'ISO-8859-15');
echo $IDN->encode($mystring);
?>
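
To see why the source encoding matters, consider the byte 0xA4: it is the euro
sign in ISO-8859-15 but the generic currency sign in ISO-8859-1, so a conversion
that assumes ISO-8859-1 yields a different character. The sketch below calls
PHP's mbstring function mb_convert_encoding() directly instead of going through
transcode_wrapper.php, purely for illustration, and assumes the mbstring
extension is installed:

```php
<?php
// One byte, 0xA4, interpreted in two different source encodings
$byte = "\xA4";

// Read as ISO-8859-15 it is the euro sign, U+20AC
$as_latin9 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-15');
// Read as ISO-8859-1 it is the currency sign, U+00A4
$as_latin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

// Show the resulting UTF-8 byte sequences in hex
echo bin2hex($as_latin9), "\n"; // e282ac
echo bin2hex($as_latin1), "\n"; // c2a4
```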


UCTC - Unicode Transcoder
-------------------------
Another class you might find useful when dealing with one or more of the Unicode
encoding flavours. The class is static and requires PHP5. It can transcode
between the following encodings:
- UCS-4 string / array
- UTF-8
- UTF-7
- UTF-7 IMAP (modified UTF-7)
All encodings expect / return a string in the given format, with one major
exception: UCS-4 array is just an array where each value represents one code
point in the string, i.e. every value is a 32-bit integer.

Example usage:
<?php
require_once('uctc.php');
$mystring = 'nörgler.com';
echo uctc::convert($mystring, 'utf8', 'utf7imap');
?>
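
To make the UCS-4 array format concrete, the following sketch builds such an
array by hand. It deliberately avoids uctc.php and assumes the mbstring
extension with mb_ord() (PHP 7.2 or later); the variable names are illustrative
only:

```php
<?php
// A UTF-8 encoded input string
$utf8 = 'nörgler';

// Split into single characters (the /u modifier makes PCRE UTF-8 aware) ...
$chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
// ... and map each character to its Unicode code point
$ucs4_array = array_map(function ($char) {
    return mb_ord($char, 'UTF-8');
}, $chars);

echo implode(' ', $ucs4_array), "\n"; // 110 246 114 103 108 101 114
```

Each array element is one code point, so the two-byte UTF-8 sequence for "ö"
becomes the single value 246.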


Contact us
----------
In case of errors, bugs, questions or wishes, please don't hesitate to contact
us at the email address above.

The team of phlyLabs
http://phlylabs.de
mailto:phlymail@phlylabs.de