Revision 1378
Added by Luisehahne almost 15 years ago
| branches/2.8.x/CHANGELOG | ||
|---|---|---|
| 11 | 11 |
! = Update/Change |
| 12 | 12 |
|
| 13 | 13 |
------------------------------------- 2.8.2 ------------------------------------- |
| 14 |
13 Jan-2011 Build 1378 Werner von den Decken (DarkViper) |
|
| 15 |
! fixed inclusion of SecureForm |
|
| 16 |
+ added IDNA/Punycode to wb::validate_email() |
|
| 14 | 17 |
11 Jan-2011 Build 1377 Frank Heyne (FrankH) |
| 15 | 18 |
# Security fix for modules jsadmin, menu_link and output_filter |
| 16 | 19 |
11 Jan-2011 Build 1376 Frank Heyne (FrankH) |
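
The changelog entry for Build 1378 notes that IDNA/Punycode support was added to `wb::validate_email()`. This changeset view does not show that method's body, so the following is only a minimal sketch of the general idea. It is written in Python with the standard-library `idna` codec (IDNA 2003, RFC 3490) instead of the bundled `idna_convert` PHP class. The function name mirrors the changelog, but the regular expression and the overall logic are assumptions made for illustration.

```python
import re

# Hypothetical ASCII-only pattern, for illustration only; it is not the
# expression used by wb::validate_email().
_ASCII_EMAIL = re.compile(
    r"^[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z0-9-]{2,}$"
)


def validate_email(address: str) -> bool:
    """Check an address after converting its domain part to Punycode (ACE)."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    try:
        # Convert each domain label to its ASCII-compatible encoding.
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        # Raised for empty or over-long labels and for prohibited characters.
        return False
    # Only the domain is converted; the local part must already be ASCII here.
    return _ASCII_EMAIL.match(f"{local}@{ascii_domain}") is not None
```

Converting only the domain part keeps the existing ASCII check reusable: an address such as `info@bücher.de` is tested in its `xn--` form, while an address without an `@` is rejected before any conversion is attempted.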
| branches/2.8.x/wb/include/idna_convert/example.php | ||
|---|---|---|
| 1 |
<?php |
|
| 2 |
$encoded = $decoded = $add = ''; |
|
| 3 |
header('Content-Type: text/html; charset=utf-8');
|
|
| 4 |
require_once('idna_convert.class.php');
|
|
| 5 |
$IDN = new idna_convert(); |
|
| 6 |
if (isset($_REQUEST['encode'])) {
|
|
| 7 |
$decoded = isset($_REQUEST['decoded']) ? stripslashes($_REQUEST['decoded']) : ''; |
|
| 8 |
$encoded = $IDN->encode($decoded); |
|
| 9 |
} |
|
| 10 |
if (isset($_REQUEST['decode'])) {
|
|
| 11 |
$encoded = isset($_REQUEST['encoded']) ? stripslashes($_REQUEST['encoded']) : ''; |
|
| 12 |
$decoded = $IDN->decode($encoded); |
|
| 13 |
} |
|
| 14 |
$lang = 'en'; |
|
| 15 |
if (isset($_REQUEST['lang'])) {
|
|
| 16 |
if ('de' == $_REQUEST['lang'] || 'en' == $_REQUEST['lang']) $lang = $_REQUEST['lang'];
|
|
| 17 |
$add .= '<input type="hidden" name="lang" value="'.$lang.'" />'."\n"; |
|
| 18 |
} |
|
| 19 |
?> |
|
| 20 |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
|
| 21 |
<html xmlns="http://www.w3.org/1999/xhtml"> |
|
| 22 |
<head> |
|
| 23 |
<title>phlyLabs Punycode Converter</title> |
|
| 24 |
<meta name="author" content="phlyLabs" /> |
|
| 25 |
<meta http-equiv="content-type" content="text/html; charset=utf-8" /> |
|
| 26 |
<style type="text/css"> |
|
| 27 |
/*<![CDATA[*/ |
|
| 28 |
body { color:black;background:white;font-size:10pt;font-family:Verdana,Helvetica,Sans-Serif; }
|
|
| 29 |
body, form { margin:0; }
|
|
| 30 |
form { display:inline; }
|
|
| 31 |
input { font-size:8pt;font-family:Verdana,Helvetica,Sans-Serif; }
|
|
| 32 |
#round { width:730px;padding:10px;background-color:rgb(230,230,240);border:1px solid black;text-align:center;vertical-align:middle;margin:auto;margin-top:50px; }
|
|
| 33 |
th { font-size:9pt;font-weight:bold; }
|
|
| 34 |
#copy { font-size:8pt;color:rgb(60,60,80); }
|
|
| 35 |
#subhead { font-size:8pt; }
|
|
| 36 |
#bla { font-size:8pt;text-align:left; }
|
|
| 37 |
h5 {margin:0;font-size:11pt;font-weight:bold;}
|
|
| 38 |
/*]]>*/ |
|
| 39 |
</style> |
|
| 40 |
</head> |
|
| 41 |
<body> |
|
| 42 |
<div id="round"> |
|
| 43 |
<h5>phlyLabs' pure PHP IDNA Converter</h5><br /> |
|
| 44 |
<span id="subhead"> |
|
| 45 |
See <a href="http://faqs.org/rfcs/rfc3490.html" title="IDNA" target="_blank">RFC3490</a>, |
|
| 46 |
<a href="http://faqs.org/rfcs/rfc3491.html" title="Nameprep, a Stringprep profile" target="_blank">RFC3491</a>, |
|
| 47 |
<a href="http://faqs.org/rfcs/rfc3492.html" title="Punycode" target="_blank">RFC3492</a> and |
|
| 48 |
<a href="http://faqs.org/rfcs/rfc3454.html" title="Stringprep" target="_blank">RFC3454</a><br /> |
|
| 49 |
</span> |
|
| 50 |
<br /> |
|
| 51 |
<div id="bla"><?php if ($lang == 'de') { ?>
|
|
| 52 |
Dieser Konverter erlaubt die Übersetzung von Domainnamen zwischen der Punycode- und der |
|
| 53 |
Unicode-Schreibweise.<br /> |
|
| 54 |
Geben Sie einfach den Domainnamen im entsprechend bezeichneten Feld ein und klicken Sie dann auf den darunter |
|
| 55 |
liegenden Button. Sie können einfache Domainnamen, komplette URLs (wie http://jürgen-müller.de) |
|
| 56 |
oder Emailadressen eingeben.<br /> |
|
| 57 |
<br /> |
|
| 58 |
Stellen Sie aber sicher, dass Ihr Browser den Zeichensatz <strong>UTF-8</strong> unterstützt.<br /> |
|
| 59 |
<br /> |
|
| 60 |
Wenn Sie Interesse an der zugrundeliegenden PHP-Klasse haben, können Sie diese |
|
| 61 |
<a href="http://phlymail.com/de/downloads/idna/download/">hier herunterladen</a>.<br /> |
|
| 62 |
<br /> |
|
| 63 |
Diese Klasse wird ohne Garantie ihrer Funktionstüchtigkeit bereit gestellt. Nutzung auf eigene Gefahr.<br /> |
|
| 64 |
Um sicher zu stellen, dass eine Zeichenkette korrekt umgewandelt wurde, sollten Sie diese immer zurückwandeln |
|
| 65 |
und das Ergebnis mit Ihrer ursprünglichen Eingabe vergleichen.<br /> |
|
| 66 |
<br /> |
|
| 67 |
Fehler und Probleme können Sie gern an <a href="mailto:team@phlymail.de">team@phlymail.de</a> senden.<br /> |
|
| 68 |
<?php } else { ?>
|
|
| 69 |
This converter allows you to transfer domain names between the encoded (Punycode) notation |
|
| 70 |
and the decoded (UTF-8) notation.<br /> |
|
| 71 |
Just enter the domain name in the respective field and click on the button right below it to have |
|
| 72 |
it converted. Note that you can also enter complete domain names (like jürgen-müller.de) |
|
| 73 |
or email addresses.<br /> |
|
| 74 |
<br /> |
|
| 75 |
Make sure that your browser supports the <strong>UTF-8</strong> character encoding.<br /> |
|
| 76 |
<br /> |
|
| 77 |
For those of you interested in the PHP source of the underlying class, you might |
|
| 78 |
<a href="http://phlymail.com/en/downloads/idna/download/">download it here</a>.<br /> |
|
| 79 |
<br /> |
|
| 80 |
Please be aware that this class is provided as is and without any liability. Use at your own risk.<br /> |
|
| 81 |
To ensure that a certain string has been converted correctly, you should convert it both ways and compare the |
|
| 82 |
results.<br /> |
|
| 83 |
<br /> |
|
| 84 |
Please feel free to report bugs and problems to: <a href="mailto:team@phlymail.com">team@phlymail.com</a>.<br /> |
|
| 85 |
<?php } ?> |
|
| 86 |
<br /> |
|
| 87 |
</div> |
|
| 88 |
<table border="0" cellpadding="2" cellspacing="2" align="center"> |
|
| 89 |
<thead> |
|
| 90 |
<tr> |
|
| 91 |
<th align="left">Original (Unicode)</th> |
|
| 92 |
<th align="right">Punycode (ACE)</th> |
|
| 93 |
</tr> |
|
| 94 |
</thead> |
|
| 95 |
<tbody> |
|
| 96 |
<tr> |
|
| 97 |
<td align="right"> |
|
| 98 |
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get"> |
|
| 99 |
<input type="text" name="decoded" value="<?php echo htmlentities($decoded, null, 'UTF-8'); ?>" size="48" maxlength="255" /><br /> |
|
| 100 |
<input type="submit" name="encode" value="Encode >>" /><?php echo $add; ?> |
|
| 101 |
</form> |
|
| 102 |
</td> |
|
| 103 |
<td align="left"> |
|
| 104 |
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get"> |
|
| 105 |
<input type="text" name="encoded" value="<?php echo htmlentities($encoded, null, 'UTF-8'); ?>" size="48" maxlength="255" /><br /> |
|
| 106 |
<input type="submit" name="decode" value="<< Decode" /><?php echo $add; ?> |
|
| 107 |
</form> |
|
| 108 |
</td> |
|
| 109 |
</tr> |
|
| 110 |
</tbody> |
|
| 111 |
</table> |
|
| 112 |
<br /> |
|
| 113 |
<span id="copy">Version used: 0.6.9; © 2004-2010 phlyLabs Berlin; part of <a href="http://phlymail.com/">phlyMail</a></span> |
|
| 114 |
</div> |
|
| 115 |
</body> |
|
| 116 |
</html> |
|
| 0 | 117 | |
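
The example page advises converting a string both ways and comparing the results. A minimal sketch of that round-trip check follows, in Python with the standard-library `idna` codec (IDNA 2003) rather than the `idna_convert` class added in this revision. `encode_checked` is a hypothetical helper name chosen for illustration.

```python
def encode_checked(domain: str) -> str:
    """Return the Punycode (ACE) form of `domain`, verified by decoding it again."""
    encoded = domain.encode("idna")   # Unicode -> ACE, label by label
    decoded = encoded.decode("idna")  # ACE -> Unicode
    if decoded != domain:
        # Nameprep case-folds labels that contain non-ASCII characters,
        # so mixed-case input of that kind ends up here.
        raise ValueError(f"round trip changed {domain!r} to {decoded!r}")
    return encoded.decode("ascii")


if __name__ == "__main__":
    print(encode_checked("bücher.de"))
```

Raising on a mismatch, instead of silently returning the encoded form, makes a lossy conversion visible to the caller. Invalid labels surface as `UnicodeError`, which is itself a subclass of `ValueError`, so a single `except ValueError` covers both cases.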
| branches/2.8.x/wb/include/idna_convert/LICENCE | ||
|---|---|---|
| 1 |
GNU LESSER GENERAL PUBLIC LICENSE |
|
| 2 |
Version 2.1, February 1999 |
|
| 3 |
|
|
| 4 |
Copyright (C) 1991, 1999 Free Software Foundation, Inc. |
|
| 5 |
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
|
| 6 |
Everyone is permitted to copy and distribute verbatim copies |
|
| 7 |
of this license document, but changing it is not allowed. |
|
| 8 |
|
|
| 9 |
[This is the first released version of the Lesser GPL. It also counts |
|
| 10 |
as the successor of the GNU Library Public License, version 2, hence |
|
| 11 |
the version number 2.1.] |
|
| 12 |
|
|
| 13 |
Preamble |
|
| 14 |
|
|
| 15 |
The licenses for most software are designed to take away your |
|
| 16 |
freedom to share and change it. By contrast, the GNU General Public |
|
| 17 |
Licenses are intended to guarantee your freedom to share and change |
|
| 18 |
free software--to make sure the software is free for all its users. |
|
| 19 |
|
|
| 20 |
This license, the Lesser General Public License, applies to some |
|
| 21 |
specially designated software packages--typically libraries--of the |
|
| 22 |
Free Software Foundation and other authors who decide to use it. You |
|
| 23 |
can use it too, but we suggest you first think carefully about whether |
|
| 24 |
this license or the ordinary General Public License is the better |
|
| 25 |
strategy to use in any particular case, based on the explanations below. |
|
| 26 |
|
|
| 27 |
When we speak of free software, we are referring to freedom of use, |
|
| 28 |
not price. Our General Public Licenses are designed to make sure that |
|
| 29 |
you have the freedom to distribute copies of free software (and charge |
|
| 30 |
for this service if you wish); that you receive source code or can get |
|
| 31 |
it if you want it; that you can change the software and use pieces of |
|
| 32 |
it in new free programs; and that you are informed that you can do |
|
| 33 |
these things. |
|
| 34 |
|
|
| 35 |
To protect your rights, we need to make restrictions that forbid |
|
| 36 |
distributors to deny you these rights or to ask you to surrender these |
|
| 37 |
rights. These restrictions translate to certain responsibilities for |
|
| 38 |
you if you distribute copies of the library or if you modify it. |
|
| 39 |
|
|
| 40 |
For example, if you distribute copies of the library, whether gratis |
|
| 41 |
or for a fee, you must give the recipients all the rights that we gave |
|
| 42 |
you. You must make sure that they, too, receive or can get the source |
|
| 43 |
code. If you link other code with the library, you must provide |
|
| 44 |
complete object files to the recipients, so that they can relink them |
|
| 45 |
with the library after making changes to the library and recompiling |
|
| 46 |
it. And you must show them these terms so they know their rights. |
|
| 47 |
|
|
| 48 |
We protect your rights with a two-step method: (1) we copyright the |
|
| 49 |
library, and (2) we offer you this license, which gives you legal |
|
| 50 |
permission to copy, distribute and/or modify the library. |
|
| 51 |
|
|
| 52 |
To protect each distributor, we want to make it very clear that |
|
| 53 |
there is no warranty for the free library. Also, if the library is |
|
| 54 |
modified by someone else and passed on, the recipients should know |
|
| 55 |
that what they have is not the original version, so that the original |
|
| 56 |
author's reputation will not be affected by problems that might be |
|
| 57 |
introduced by others. |
|
| 58 |
|
|
| 59 |
Finally, software patents pose a constant threat to the existence of |
|
| 60 |
any free program. We wish to make sure that a company cannot |
|
| 61 |
effectively restrict the users of a free program by obtaining a |
|
| 62 |
restrictive license from a patent holder. Therefore, we insist that |
|
| 63 |
any patent license obtained for a version of the library must be |
|
| 64 |
consistent with the full freedom of use specified in this license. |
|
| 65 |
|
|
| 66 |
Most GNU software, including some libraries, is covered by the |
|
| 67 |
ordinary GNU General Public License. This license, the GNU Lesser |
|
| 68 |
General Public License, applies to certain designated libraries, and |
|
| 69 |
is quite different from the ordinary General Public License. We use |
|
| 70 |
this license for certain libraries in order to permit linking those |
|
| 71 |
libraries into non-free programs. |
|
| 72 |
|
|
| 73 |
When a program is linked with a library, whether statically or using |
|
| 74 |
a shared library, the combination of the two is legally speaking a |
|
| 75 |
combined work, a derivative of the original library. The ordinary |
|
| 76 |
General Public License therefore permits such linking only if the |
|
| 77 |
entire combination fits its criteria of freedom. The Lesser General |
|
| 78 |
Public License permits more lax criteria for linking other code with |
|
| 79 |
the library. |
|
| 80 |
|
|
| 81 |
We call this license the "Lesser" General Public License because it |
|
| 82 |
does Less to protect the user's freedom than the ordinary General |
|
| 83 |
Public License. It also provides other free software developers Less |
|
| 84 |
of an advantage over competing non-free programs. These disadvantages |
|
| 85 |
are the reason we use the ordinary General Public License for many |
|
| 86 |
libraries. However, the Lesser license provides advantages in certain |
|
| 87 |
special circumstances. |
|
| 88 |
|
|
| 89 |
For example, on rare occasions, there may be a special need to |
|
| 90 |
encourage the widest possible use of a certain library, so that it becomes |
|
| 91 |
a de-facto standard. To achieve this, non-free programs must be |
|
| 92 |
allowed to use the library. A more frequent case is that a free |
|
| 93 |
library does the same job as widely used non-free libraries. In this |
|
| 94 |
case, there is little to gain by limiting the free library to free |
|
| 95 |
software only, so we use the Lesser General Public License. |
|
| 96 |
|
|
| 97 |
In other cases, permission to use a particular library in non-free |
|
| 98 |
programs enables a greater number of people to use a large body of |
|
| 99 |
free software. For example, permission to use the GNU C Library in |
|
| 100 |
non-free programs enables many more people to use the whole GNU |
|
| 101 |
operating system, as well as its variant, the GNU/Linux operating |
|
| 102 |
system. |
|
| 103 |
|
|
| 104 |
Although the Lesser General Public License is Less protective of the |
|
| 105 |
users' freedom, it does ensure that the user of a program that is |
|
| 106 |
linked with the Library has the freedom and the wherewithal to run |
|
| 107 |
that program using a modified version of the Library. |
|
| 108 |
|
|
| 109 |
The precise terms and conditions for copying, distribution and |
|
| 110 |
modification follow. Pay close attention to the difference between a |
|
| 111 |
"work based on the library" and a "work that uses the library". The |
|
| 112 |
former contains code derived from the library, whereas the latter must |
|
| 113 |
be combined with the library in order to run. |
|
| 114 |
|
|
| 115 |
GNU LESSER GENERAL PUBLIC LICENSE |
|
| 116 |
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION |
|
| 117 |
|
|
| 118 |
0. This License Agreement applies to any software library or other |
|
| 119 |
program which contains a notice placed by the copyright holder or |
|
| 120 |
other authorized party saying it may be distributed under the terms of |
|
| 121 |
this Lesser General Public License (also called "this License"). |
|
| 122 |
Each licensee is addressed as "you". |
|
| 123 |
|
|
| 124 |
A "library" means a collection of software functions and/or data |
|
| 125 |
prepared so as to be conveniently linked with application programs |
|
| 126 |
(which use some of those functions and data) to form executables. |
|
| 127 |
|
|
| 128 |
The "Library", below, refers to any such software library or work |
|
| 129 |
which has been distributed under these terms. A "work based on the |
|
| 130 |
Library" means either the Library or any derivative work under |
|
| 131 |
copyright law: that is to say, a work containing the Library or a |
|
| 132 |
portion of it, either verbatim or with modifications and/or translated |
|
| 133 |
straightforwardly into another language. (Hereinafter, translation is |
|
| 134 |
included without limitation in the term "modification".) |
|
| 135 |
|
|
| 136 |
"Source code" for a work means the preferred form of the work for |
|
| 137 |
making modifications to it. For a library, complete source code means |
|
| 138 |
all the source code for all modules it contains, plus any associated |
|
| 139 |
interface definition files, plus the scripts used to control compilation |
|
| 140 |
and installation of the library. |
|
| 141 |
|
|
| 142 |
Activities other than copying, distribution and modification are not |
|
| 143 |
covered by this License; they are outside its scope. The act of |
|
| 144 |
running a program using the Library is not restricted, and output from |
|
| 145 |
such a program is covered only if its contents constitute a work based |
|
| 146 |
on the Library (independent of the use of the Library in a tool for |
|
| 147 |
writing it). Whether that is true depends on what the Library does |
|
| 148 |
and what the program that uses the Library does. |
|
| 149 |
|
|
| 150 |
1. You may copy and distribute verbatim copies of the Library's |
|
| 151 |
complete source code as you receive it, in any medium, provided that |
|
| 152 |
you conspicuously and appropriately publish on each copy an |
|
| 153 |
appropriate copyright notice and disclaimer of warranty; keep intact |
|
| 154 |
all the notices that refer to this License and to the absence of any |
|
| 155 |
warranty; and distribute a copy of this License along with the |
|
| 156 |
Library. |
|
| 157 |
|
|
| 158 |
You may charge a fee for the physical act of transferring a copy, |
|
| 159 |
and you may at your option offer warranty protection in exchange for a |
|
| 160 |
fee. |
|
| 161 |
|
|
| 162 |
2. You may modify your copy or copies of the Library or any portion |
|
| 163 |
of it, thus forming a work based on the Library, and copy and |
|
| 164 |
distribute such modifications or work under the terms of Section 1 |
|
| 165 |
above, provided that you also meet all of these conditions: |
|
| 166 |
|
|
| 167 |
a) The modified work must itself be a software library. |
|
| 168 |
|
|
| 169 |
b) You must cause the files modified to carry prominent notices |
|
| 170 |
stating that you changed the files and the date of any change. |
|
| 171 |
|
|
| 172 |
c) You must cause the whole of the work to be licensed at no |
|
| 173 |
charge to all third parties under the terms of this License. |
|
| 174 |
|
|
| 175 |
d) If a facility in the modified Library refers to a function or a |
|
| 176 |
table of data to be supplied by an application program that uses |
|
| 177 |
the facility, other than as an argument passed when the facility |
|
| 178 |
is invoked, then you must make a good faith effort to ensure that, |
|
| 179 |
in the event an application does not supply such function or |
|
| 180 |
table, the facility still operates, and performs whatever part of |
|
| 181 |
its purpose remains meaningful. |
|
| 182 |
|
|
| 183 |
(For example, a function in a library to compute square roots has |
|
| 184 |
a purpose that is entirely well-defined independent of the |
|
| 185 |
application. Therefore, Subsection 2d requires that any |
|
| 186 |
application-supplied function or table used by this function must |
|
| 187 |
be optional: if the application does not supply it, the square |
|
| 188 |
root function must still compute square roots.) |
|
| 189 |
|
|
| 190 |
These requirements apply to the modified work as a whole. If |
|
| 191 |
identifiable sections of that work are not derived from the Library, |
|
| 192 |
and can be reasonably considered independent and separate works in |
|
| 193 |
themselves, then this License, and its terms, do not apply to those |
|
| 194 |
sections when you distribute them as separate works. But when you |
|
| 195 |
distribute the same sections as part of a whole which is a work based |
|
| 196 |
on the Library, the distribution of the whole must be on the terms of |
|
| 197 |
this License, whose permissions for other licensees extend to the |
|
| 198 |
entire whole, and thus to each and every part regardless of who wrote |
|
| 199 |
it. |
|
| 200 |
|
|
| 201 |
Thus, it is not the intent of this section to claim rights or contest |
|
| 202 |
your rights to work written entirely by you; rather, the intent is to |
|
| 203 |
exercise the right to control the distribution of derivative or |
|
| 204 |
collective works based on the Library. |
|
| 205 |
|
|
| 206 |
In addition, mere aggregation of another work not based on the Library |
|
| 207 |
with the Library (or with a work based on the Library) on a volume of |
|
| 208 |
a storage or distribution medium does not bring the other work under |
|
| 209 |
the scope of this License. |
|
| 210 |
|
|
| 211 |
3. You may opt to apply the terms of the ordinary GNU General Public |
|
| 212 |
License instead of this License to a given copy of the Library. To do |
|
| 213 |
this, you must alter all the notices that refer to this License, so |
|
| 214 |
that they refer to the ordinary GNU General Public License, version 2, |
|
| 215 |
instead of to this License. (If a newer version than version 2 of the |
|
| 216 |
ordinary GNU General Public License has appeared, then you can specify |
|
| 217 |
that version instead if you wish.) Do not make any other change in |
|
| 218 |
these notices. |
|
| 219 |
|
|
| 220 |
Once this change is made in a given copy, it is irreversible for |
|
| 221 |
that copy, so the ordinary GNU General Public License applies to all |
|
| 222 |
subsequent copies and derivative works made from that copy. |
|
| 223 |
|
|
| 224 |
This option is useful when you wish to copy part of the code of |
|
| 225 |
the Library into a program that is not a library. |
|
| 226 |
|
|
| 227 |
4. You may copy and distribute the Library (or a portion or |
|
| 228 |
derivative of it, under Section 2) in object code or executable form |
|
| 229 |
under the terms of Sections 1 and 2 above provided that you accompany |
|
| 230 |
it with the complete corresponding machine-readable source code, which |
|
| 231 |
must be distributed under the terms of Sections 1 and 2 above on a |
|
| 232 |
medium customarily used for software interchange. |
|
| 233 |
|
|
| 234 |
If distribution of object code is made by offering access to copy |
|
| 235 |
from a designated place, then offering equivalent access to copy the |
|
| 236 |
source code from the same place satisfies the requirement to |
|
| 237 |
distribute the source code, even though third parties are not |
|
| 238 |
compelled to copy the source along with the object code. |
|
| 239 |
|
|
| 240 |
5. A program that contains no derivative of any portion of the |
|
| 241 |
Library, but is designed to work with the Library by being compiled or |
|
| 242 |
linked with it, is called a "work that uses the Library". Such a |
|
| 243 |
work, in isolation, is not a derivative work of the Library, and |
|
| 244 |
therefore falls outside the scope of this License. |
|
| 245 |
|
|
| 246 |
However, linking a "work that uses the Library" with the Library |
|
| 247 |
creates an executable that is a derivative of the Library (because it |
|
| 248 |
contains portions of the Library), rather than a "work that uses the |
|
| 249 |
library". The executable is therefore covered by this License. |
|
| 250 |
Section 6 states terms for distribution of such executables. |
|
| 251 |
|
|
| 252 |
When a "work that uses the Library" uses material from a header file |
|
| 253 |
that is part of the Library, the object code for the work may be a |
|
| 254 |
derivative work of the Library even though the source code is not. |
|
| 255 |
Whether this is true is especially significant if the work can be |
|
| 256 |
linked without the Library, or if the work is itself a library. The |
|
| 257 |
threshold for this to be true is not precisely defined by law. |
|
| 258 |
|
|
| 259 |
If such an object file uses only numerical parameters, data |
|
| 260 |
structure layouts and accessors, and small macros and small inline |
|
| 261 |
functions (ten lines or less in length), then the use of the object |
|
| 262 |
file is unrestricted, regardless of whether it is legally a derivative |
|
| 263 |
work. (Executables containing this object code plus portions of the |
|
| 264 |
Library will still fall under Section 6.) |
|
| 265 |
|
|
| 266 |
Otherwise, if the work is a derivative of the Library, you may |
|
| 267 |
distribute the object code for the work under the terms of Section 6. |
|
| 268 |
Any executables containing that work also fall under Section 6, |
|
| 269 |
whether or not they are linked directly with the Library itself. |
|
| 270 |
|
|
| 271 |
6. As an exception to the Sections above, you may also combine or |
|
| 272 |
link a "work that uses the Library" with the Library to produce a |
|
| 273 |
work containing portions of the Library, and distribute that work |
|
| 274 |
under terms of your choice, provided that the terms permit |
|
| 275 |
modification of the work for the customer's own use and reverse |
|
| 276 |
engineering for debugging such modifications. |
|
| 277 |
|
|
| 278 |
You must give prominent notice with each copy of the work that the |
|
| 279 |
Library is used in it and that the Library and its use are covered by |
|
| 280 |
this License. You must supply a copy of this License. If the work |
|
| 281 |
during execution displays copyright notices, you must include the |
|
| 282 |
copyright notice for the Library among them, as well as a reference |
|
| 283 |
directing the user to the copy of this License. Also, you must do one |
|
| 284 |
of these things: |
|
| 285 |
|
|
| 286 |
a) Accompany the work with the complete corresponding |
|
| 287 |
machine-readable source code for the Library including whatever |
|
| 288 |
changes were used in the work (which must be distributed under |
|
| 289 |
Sections 1 and 2 above); and, if the work is an executable linked |
|
| 290 |
with the Library, with the complete machine-readable "work that |
|
| 291 |
uses the Library", as object code and/or source code, so that the |
|
| 292 |
user can modify the Library and then relink to produce a modified |
|
| 293 |
executable containing the modified Library. (It is understood |
|
| 294 |
that the user who changes the contents of definitions files in the |
|
| 295 |
Library will not necessarily be able to recompile the application |
|
| 296 |
to use the modified definitions.) |
|
| 297 |
|
|
| 298 |
b) Use a suitable shared library mechanism for linking with the |
|
| 299 |
Library. A suitable mechanism is one that (1) uses at run time a |
|
| 300 |
copy of the library already present on the user's computer system, |
|
| 301 |
rather than copying library functions into the executable, and (2) |
|
| 302 |
will operate properly with a modified version of the library, if |
|
| 303 |
the user installs one, as long as the modified version is |
|
| 304 |
interface-compatible with the version that the work was made with. |
|
| 305 |
|
|
| 306 |
c) Accompany the work with a written offer, valid for at |
|
| 307 |
least three years, to give the same user the materials |
|
| 308 |
specified in Subsection 6a, above, for a charge no more |
|
| 309 |
than the cost of performing this distribution. |
|
| 310 |
|
|
| 311 |
d) If distribution of the work is made by offering access to copy |
|
| 312 |
from a designated place, offer equivalent access to copy the above |
|
| 313 |
specified materials from the same place. |
|
| 314 |
|
|
| 315 |
e) Verify that the user has already received a copy of these |
|
| 316 |
materials or that you have already sent this user a copy. |
|
| 317 |
|
|
| 318 |
For an executable, the required form of the "work that uses the |
|
| 319 |
Library" must include any data and utility programs needed for |
|
| 320 |
reproducing the executable from it. However, as a special exception, |
|
| 321 |
the materials to be distributed need not include anything that is |
|
| 322 |
normally distributed (in either source or binary form) with the major |
|
| 323 |
components (compiler, kernel, and so on) of the operating system on |
|
| 324 |
which the executable runs, unless that component itself accompanies |
|
| 325 |
the executable. |
|
| 326 |
|
|
| 327 |
It may happen that this requirement contradicts the license |
|
| 328 |
restrictions of other proprietary libraries that do not normally |
|
| 329 |
accompany the operating system. Such a contradiction means you cannot |
|
| 330 |
use both them and the Library together in an executable that you |
|
| 331 |
distribute. |
|
| 332 |
|
|
| 333 |
7. You may place library facilities that are a work based on the |
|
| 334 |
Library side-by-side in a single library together with other library |
|
| 335 |
facilities not covered by this License, and distribute such a combined |
|
| 336 |
library, provided that the separate distribution of the work based on |
|
| 337 |
the Library and of the other library facilities is otherwise |
|
| 338 |
permitted, and provided that you do these two things: |
|
| 339 |
|
|
| 340 |
a) Accompany the combined library with a copy of the same work |
|
| 341 |
based on the Library, uncombined with any other library |
|
| 342 |
facilities. This must be distributed under the terms of the |
|
| 343 |
Sections above. |
|
| 344 |
|
|
| 345 |
b) Give prominent notice with the combined library of the fact |
|
| 346 |
that part of it is a work based on the Library, and explaining |
|
| 347 |
where to find the accompanying uncombined form of the same work. |
|
| 348 |
|
|
| 349 |
8. You may not copy, modify, sublicense, link with, or distribute |
|
| 350 |
the Library except as expressly provided under this License. Any |
|
| 351 |
attempt otherwise to copy, modify, sublicense, link with, or |
|
| 352 |
distribute the Library is void, and will automatically terminate your |
|
| 353 |
rights under this License. However, parties who have received copies, |
|
| 354 |
or rights, from you under this License will not have their licenses |
|
| 355 |
terminated so long as such parties remain in full compliance. |
|
| 356 |
|
|
| 357 |
9. You are not required to accept this License, since you have not |
|
| 358 |
signed it. However, nothing else grants you permission to modify or |
|
| 359 |
distribute the Library or its derivative works. These actions are |
|
| 360 |
prohibited by law if you do not accept this License. Therefore, by |
|
| 361 |
modifying or distributing the Library (or any work based on the |
|
| 362 |
Library), you indicate your acceptance of this License to do so, and |
|
| 363 |
all its terms and conditions for copying, distributing or modifying |
|
| 364 |
the Library or works based on it. |
|
| 365 |
|
|
| 366 |
10. Each time you redistribute the Library (or any work based on the |
|
| 367 |
Library), the recipient automatically receives a license from the |
|
| 368 |
original licensor to copy, distribute, link with or modify the Library |
|
| 369 |
subject to these terms and conditions. You may not impose any further |
|
| 370 |
restrictions on the recipients' exercise of the rights granted herein. |
|
| 371 |
You are not responsible for enforcing compliance by third parties with |
|
| 372 |
this License. |
|
| 373 |
|
|
| 374 |
11. If, as a consequence of a court judgment or allegation of patent |
|
| 375 |
infringement or for any other reason (not limited to patent issues), |
|
| 376 |
conditions are imposed on you (whether by court order, agreement or |
|
| 377 |
otherwise) that contradict the conditions of this License, they do not |
|
| 378 |
excuse you from the conditions of this License. If you cannot |
|
| 379 |
distribute so as to satisfy simultaneously your obligations under this |
|
| 380 |
License and any other pertinent obligations, then as a consequence you |
|
| 381 |
may not distribute the Library at all. For example, if a patent |
|
| 382 |
license would not permit royalty-free redistribution of the Library by |
|
| 383 |
all those who receive copies directly or indirectly through you, then |
|
| 384 |
the only way you could satisfy both it and this License would be to |
|
| 385 |
refrain entirely from distribution of the Library. |
|
| 386 |
|
|
| 387 |
If any portion of this section is held invalid or unenforceable under any |
|
| 388 |
particular circumstance, the balance of the section is intended to apply, |
|
| 389 |
and the section as a whole is intended to apply in other circumstances. |
|
| 390 |
|
|
| 391 |
It is not the purpose of this section to induce you to infringe any |
|
| 392 |
patents or other property right claims or to contest validity of any |
|
| 393 |
such claims; this section has the sole purpose of protecting the |
|
| 394 |
integrity of the free software distribution system which is |
|
| 395 |
implemented by public license practices. Many people have made |
|
| 396 |
generous contributions to the wide range of software distributed |
|
| 397 |
through that system in reliance on consistent application of that |
|
| 398 |
system; it is up to the author/donor to decide if he or she is willing |
|
| 399 |
to distribute software through any other system and a licensee cannot |
|
| 400 |
impose that choice. |
|
| 401 |
|
|
| 402 |
This section is intended to make thoroughly clear what is believed to |
|
| 403 |
be a consequence of the rest of this License. |
|
| 404 |
|
|
| 405 |
12. If the distribution and/or use of the Library is restricted in |
|
| 406 |
certain countries either by patents or by copyrighted interfaces, the |
|
| 407 |
original copyright holder who places the Library under this License may add |
|
| 408 |
an explicit geographical distribution limitation excluding those countries, |
|
| 409 |
so that distribution is permitted only in or among countries not thus |
|
| 410 |
excluded. In such case, this License incorporates the limitation as if |
|
| 411 |
written in the body of this License. |
|
| 412 |
|
|
| 413 |
13. The Free Software Foundation may publish revised and/or new |
|
| 414 |
versions of the Lesser General Public License from time to time. |
|
| 415 |
Such new versions will be similar in spirit to the present version, |
|
| 416 |
but may differ in detail to address new problems or concerns. |
|
| 417 |
|
|
| 418 |
Each version is given a distinguishing version number. If the Library |
|
| 419 |
specifies a version number of this License which applies to it and |
|
| 420 |
"any later version", you have the option of following the terms and |
|
| 421 |
conditions either of that version or of any later version published by |
|
| 422 |
the Free Software Foundation. If the Library does not specify a |
|
| 423 |
license version number, you may choose any version ever published by |
|
| 424 |
the Free Software Foundation. |
|
| 425 |
|
|
| 426 |
14. If you wish to incorporate parts of the Library into other free |
|
| 427 |
programs whose distribution conditions are incompatible with these, |
|
| 428 |
write to the author to ask for permission. For software which is |
|
| 429 |
copyrighted by the Free Software Foundation, write to the Free |
|
| 430 |
Software Foundation; we sometimes make exceptions for this. Our |
|
| 431 |
decision will be guided by the two goals of preserving the free status |
|
| 432 |
of all derivatives of our free software and of promoting the sharing |
|
| 433 |
and reuse of software generally. |
|
| 434 |
|
|
| 435 |
NO WARRANTY |
|
| 436 |
|
|
| 437 |
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO |
|
| 438 |
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. |
|
| 439 |
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR |
|
| 440 |
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY |
|
| 441 |
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE |
|
| 442 |
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR |
|
| 443 |
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE |
|
| 444 |
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME |
|
| 445 |
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. |
|
| 446 |
|
|
| 447 |
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN |
|
| 448 |
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY |
|
| 449 |
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU |
|
| 450 |
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR |
|
| 451 |
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE |
|
| 452 |
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING |
|
| 453 |
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A |
|
| 454 |
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF |
|
| 455 |
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH |
|
| 456 |
DAMAGES. |
|
| 457 |
|
|
| 458 |
END OF TERMS AND CONDITIONS |
|
| 459 |
|
|
| 460 |
How to Apply These Terms to Your New Libraries |
|
| 461 |
|
|
| 462 |
If you develop a new library, and you want it to be of the greatest |
|
| 463 |
possible use to the public, we recommend making it free software that |
|
| 464 |
everyone can redistribute and change. You can do so by permitting |
|
| 465 |
redistribution under these terms (or, alternatively, under the terms of the |
|
| 466 |
ordinary General Public License). |
|
| 467 |
|
|
| 468 |
To apply these terms, attach the following notices to the library. It is |
|
| 469 |
safest to attach them to the start of each source file to most effectively |
|
| 470 |
convey the exclusion of warranty; and each file should have at least the |
|
| 471 |
"copyright" line and a pointer to where the full notice is found. |
|
| 472 |
|
|
| 473 |
<one line to give the library's name and a brief idea of what it does.> |
|
| 474 |
Copyright (C) <year> <name of author> |
|
| 475 |
|
|
| 476 |
This library is free software; you can redistribute it and/or |
|
| 477 |
modify it under the terms of the GNU Lesser General Public |
|
| 478 |
License as published by the Free Software Foundation; either |
|
| 479 |
version 2.1 of the License, or (at your option) any later version. |
|
| 480 |
|
|
| 481 |
This library is distributed in the hope that it will be useful, |
|
| 482 |
but WITHOUT ANY WARRANTY; without even the implied warranty of |
|
| 483 |
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
|
| 484 |
Lesser General Public License for more details. |
|
| 485 |
|
|
| 486 |
You should have received a copy of the GNU Lesser General Public |
|
| 487 |
License along with this library; if not, write to the Free Software |
|
| 488 |
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
|
| 489 |
|
|
| 490 |
Also add information on how to contact you by electronic and paper mail. |
|
| 491 |
|
|
| 492 |
You should also get your employer (if you work as a programmer) or your |
|
| 493 |
school, if any, to sign a "copyright disclaimer" for the library, if |
|
| 494 |
necessary. Here is a sample; alter the names: |
|
| 495 |
|
|
| 496 |
Yoyodyne, Inc., hereby disclaims all copyright interest in the |
|
| 497 |
library `Frob' (a library for tweaking knobs) written by James Random Hacker. |
|
| 498 |
|
|
| 499 |
<signature of Ty Coon>, 1 April 1990 |
|
| 500 |
Ty Coon, President of Vice |
|
| 501 |
|
|
| 502 |
That's all there is to it! |
|
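
The file that follows, `idna_convert.class.php`, implements the Punycode encoding of RFC 3492 in its `_encode()`, `_adapt()` and `_encode_digit()` methods. As a rough orientation for reading that diff, here is a minimal standalone sketch of the same encoding steps. It is not the library's code. The `punycode_*` function names are hypothetical, and it assumes PHP 7 or later, input already given as an array of Unicode code points, no Nameprep step, and no overflow checks.

```php
<?php
// Punycode parameters from RFC 3492, section 5 (the class keeps the same
// values in $_base, $_tmin, $_tmax, $_skew, $_damp, $_initial_bias, $_initial_n).
const BASE = 36;
const TMIN = 1;
const TMAX = 26;
const SKEW = 38;
const DAMP = 700;
const INITIAL_BIAS = 72;
const INITIAL_N = 0x80;

// Hypothetical helper: map a digit value to its character,
// 0..25 => 'a'..'z' and 26..35 => '0'..'9'.
function punycode_digit(int $d): string
{
    return chr($d < 26 ? $d + 97 : $d + 22);
}

// Hypothetical helper: bias adaptation as described in RFC 3492, section 6.1.
function punycode_adapt(int $delta, int $numPoints, bool $first): int
{
    $delta = $first ? intdiv($delta, DAMP) : intdiv($delta, 2);
    $delta += intdiv($delta, $numPoints);
    $k = 0;
    while ($delta > intdiv((BASE - TMIN) * TMAX, 2)) {
        $delta = intdiv($delta, BASE - TMIN);
        $k += BASE;
    }
    return $k + intdiv((BASE - TMIN + 1) * $delta, $delta + SKEW);
}

// Encode one label, given as an array of integer code points.
// Returns the Punycode string without the "xn--" prefix.
function punycode_encode_label(array $cps): string
{
    // Step 1: copy all basic (ASCII) code points to the output.
    $out = '';
    foreach ($cps as $cp) {
        if ($cp < 0x80) {
            $out .= chr($cp);
        }
    }
    $basic = strlen($out);
    $handled = $basic;
    if ($basic > 0) {
        $out .= '-'; // delimiter between basic part and encoded deltas
    }

    // Step 2: insert the non-basic code points in ascending order.
    $n = INITIAL_N;
    $delta = 0;
    $bias = INITIAL_BIAS;
    $total = count($cps);
    while ($handled < $total) {
        // Smallest code point >= $n that is still unhandled.
        $m = PHP_INT_MAX;
        foreach ($cps as $cp) {
            if ($cp >= $n && $cp < $m) {
                $m = $cp;
            }
        }
        $delta += ($m - $n) * ($handled + 1);
        $n = $m;
        foreach ($cps as $cp) {
            if ($cp < $n) {
                $delta++;
            } elseif ($cp === $n) {
                // Emit $delta as a generalized variable-length integer.
                $q = $delta;
                for ($k = BASE; ; $k += BASE) {
                    $t = $k <= $bias ? TMIN : ($k >= $bias + TMAX ? TMAX : $k - $bias);
                    if ($q < $t) {
                        break;
                    }
                    $out .= punycode_digit($t + ($q - $t) % (BASE - $t));
                    $q = intdiv($q - $t, BASE - $t);
                }
                $out .= punycode_digit($q);
                $bias = punycode_adapt($delta, $handled + 1, $handled === $basic);
                $delta = 0;
                $handled++;
            }
        }
        $delta++;
        $n++;
    }
    return $out;
}

// "bücher" as code points; the ACE form prepends the "xn--" prefix.
echo 'xn--' . punycode_encode_label([0x62, 0xFC, 0x63, 0x68, 0x65, 0x72]), "\n"; // xn--bcher-kva
```

Passing code points rather than a UTF-8 string keeps the sketch free of any UTF-8 decoding; the class in the diff does that conversion itself in `_utf8_to_ucs4()` before it runs Nameprep and the encoder.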
| branches/2.8.x/wb/include/idna_convert/idna_convert.class.php | ||
|---|---|---|
| 1 |
<?php |
|
| 2 |
// {{{ license
|
|
| 3 |
|
|
| 4 |
/* vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4 foldmethod=marker: */ |
|
| 5 |
// |
|
| 6 |
// +----------------------------------------------------------------------+ |
|
| 7 |
// | This library is free software; you can redistribute it and/or modify | |
|
| 8 |
// | it under the terms of the GNU Lesser General Public License as | |
|
| 9 |
// | published by the Free Software Foundation; either version 2.1 of the | |
|
| 10 |
// | License, or (at your option) any later version. | |
|
| 11 |
// | | |
|
| 12 |
// | This library is distributed in the hope that it will be useful, but | |
|
| 13 |
// | WITHOUT ANY WARRANTY; without even the implied warranty of | |
|
| 14 |
// | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU | |
|
| 15 |
// | Lesser General Public License for more details. | |
|
| 16 |
// | | |
|
| 17 |
// | You should have received a copy of the GNU Lesser General Public | |
|
| 18 |
// | License along with this library; if not, write to the Free Software | |
|
| 19 |
// | Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 | |
|
| 20 |
// | USA. | |
|
| 21 |
// +----------------------------------------------------------------------+ |
|
| 22 |
// |
|
| 23 |
|
|
| 24 |
// }}} |
|
| 25 |
|
|
| 26 |
/** |
|
| 27 |
* Encode/decode Internationalized Domain Names. |
|
| 28 |
* |
|
| 29 |
* The class converts internationalized domain names (IDNs, see RFC 3490 for |
|
| 30 |
* details), as accepted by various registries worldwide, between their |
|
| 31 |
* original (localized) form and the encoded form |
|
| 32 |
* that is used in the DNS (Domain Name System). |
|
| 33 |
* |
|
| 34 |
* The class provides two public methods, encode() and decode(), which do exactly |
|
| 35 |
* what you would expect them to do. You are allowed to use complete domain names, |
|
| 36 |
* simple strings and complete email addresses as well. That means that you might |
|
| 37 |
* use any of the following notations: |
|
| 38 |
* |
|
| 39 |
* - www.nörgler.com |
|
| 40 |
* - xn--nrgler-wxa |
|
| 41 |
* - xn--brse-5qa.xn--knrz-1ra.info |
|
| 42 |
* |
|
| 43 |
* Unicode input might be given as either UTF-8 string, UCS-4 string or UCS-4 array. |
|
| 44 |
* Unicode output is available in the same formats. |
|
| 45 |
* You can select your preferred format via {@link set_parameter()}.
|
|
| 46 |
* |
|
| 47 |
* ACE input and output is always expected to be ASCII. |
|
| 48 |
* |
|
| 49 |
* @author Matthias Sommerfeld <mso@phlylabs.de> |
|
| 50 |
* @author Leonid Kogan <lko@neuse.de> |
|
| 51 |
* @copyright 2004-2010 phlyLabs Berlin, http://phlylabs.de |
|
| 52 |
* @version 0.6.9 2010-11-04 |
|
| 53 |
*/ |
|
| 54 |
class idna_convert |
|
| 55 |
{
|
|
| 56 |
// NP See below |
|
| 57 |
|
|
| 58 |
// Internal settings, do not mess with them |
|
| 59 |
protected $_punycode_prefix = 'xn--'; |
|
| 60 |
protected $_invalid_ucs = 0x80000000; |
|
| 61 |
protected $_max_ucs = 0x10FFFF; |
|
| 62 |
protected $_base = 36; |
|
| 63 |
protected $_tmin = 1; |
|
| 64 |
protected $_tmax = 26; |
|
| 65 |
protected $_skew = 38; |
|
| 66 |
protected $_damp = 700; |
|
| 67 |
protected $_initial_bias = 72; |
|
| 68 |
protected $_initial_n = 0x80; |
|
| 69 |
protected $_sbase = 0xAC00; |
|
| 70 |
protected $_lbase = 0x1100; |
|
| 71 |
protected $_vbase = 0x1161; |
|
| 72 |
protected $_tbase = 0x11A7; |
|
| 73 |
protected $_lcount = 19; |
|
| 74 |
protected $_vcount = 21; |
|
| 75 |
protected $_tcount = 28; |
|
| 76 |
protected $_ncount = 588; // _vcount * _tcount |
|
| 77 |
protected $_scount = 11172; // _lcount * _tcount * _vcount |
|
| 78 |
protected $_error = false; |
|
| 79 |
|
|
| 80 |
// See {@link set_parameter()} for details of how to change the following
|
|
| 81 |
// settings from within your script / application |
|
| 82 |
protected $_api_encoding = 'utf8'; // Default input charset is UTF-8 |
|
| 83 |
protected $_allow_overlong = false; // Overlong UTF-8 encodings are forbidden |
|
| 84 |
protected $_strict_mode = false; // Behave strict or not |
|
| 85 |
protected $_encode_german_sz = true; // True to encode German ß; False, if not |
|
| 86 |
|
|
| 87 |
/** |
|
| 88 |
* the constructor |
|
| 89 |
* |
|
| 90 |
* @param array $options |
|
| 91 |
* @return boolean |
|
| 92 |
* @since 0.5.2 |
|
| 93 |
*/ |
|
| 94 |
public function __construct($options = false) |
|
| 95 |
{
|
|
| 96 |
$this->slast = $this->_sbase + $this->_lcount * $this->_vcount * $this->_tcount; |
|
| 97 |
// If parameters are given, pass these to the respective method |
|
| 98 |
if (is_array($options)) return $this->set_parameter($options); |
|
| 99 |
if (!$this->_encode_german_sz) {
|
|
| 100 |
$this->NP['replacemaps'][0xDF] = array(0x73, 0x73); |
|
| 101 |
} |
|
| 102 |
} |
|
| 103 |
|
|
| 104 |
/** |
|
| 105 |
* Sets a new option value. Available options and values: |
|
| 106 |
* [encoding - Use either UTF-8, UCS4 as array or UCS4 as string as input ('utf8' for UTF-8,
|
|
| 107 |
* 'ucs4_string' and 'ucs4_array' respectively for UCS4); The output is always UTF-8] |
|
| 108 |
* [overlong - Unicode does not allow unnecessarily long encodings of chars, |
|
| 109 |
* to allow this, set this parameter to true, else to false; |
|
| 110 |
* default is false.] |
|
| 111 |
* [strict - true: strict mode, good for registration purposes - Causes errors |
|
| 112 |
* on failures; false: loose mode, ideal for "wildlife" applications |
|
| 113 |
* by silently ignoring errors and returning the original input instead] |
|
| 114 |
* |
|
| 115 |
* @param mixed Parameter to set (string: single parameter; array of Parameter => Value pairs) |
|
| 116 |
* @param string Value to use (if parameter 1 is a string) |
|
| 117 |
* @return boolean true on success, false otherwise |
|
| 118 |
*/ |
|
| 119 |
public function set_parameter($option, $value = false) |
|
| 120 |
{
|
|
| 121 |
if (!is_array($option)) {
|
|
| 122 |
$option = array($option => $value); |
|
| 123 |
} |
|
| 124 |
foreach ($option as $k => $v) {
|
|
| 125 |
switch ($k) {
|
|
| 126 |
case 'encoding': |
|
| 127 |
switch ($v) {
|
|
| 128 |
case 'utf8': |
|
| 129 |
case 'ucs4_string': |
|
| 130 |
case 'ucs4_array': |
|
| 131 |
$this->_api_encoding = $v; |
|
| 132 |
break; |
|
| 133 |
default: |
|
| 134 |
$this->_error('Set Parameter: Unknown parameter '.$v.' for option '.$k);
|
|
| 135 |
return false; |
|
| 136 |
} |
|
| 137 |
break; |
|
| 138 |
case 'overlong': |
|
| 139 |
$this->_allow_overlong = ($v) ? true : false; |
|
| 140 |
break; |
|
| 141 |
case 'strict': |
|
| 142 |
$this->_strict_mode = ($v) ? true : false; |
|
| 143 |
break; |
|
| 144 |
case 'encode_german_sz': |
|
| 145 |
$this->_encode_german_sz = ($v) ? true : false; |
|
| 146 |
break; |
|
| 147 |
default: |
|
| 148 |
$this->_error('Set Parameter: Unknown option '.$k);
|
|
| 149 |
return false; |
|
| 150 |
} |
|
| 151 |
} |
|
| 152 |
return true; |
|
| 153 |
} |
|
| 154 |
|
|
| 155 |
/** |
|
| 156 |
* Decode a given ACE domain name |
|
| 157 |
* @param string Domain name (ACE string) |
|
| 158 |
* [@param string Desired output encoding, see {@link set_parameter}]
|
|
| 159 |
* @return string Decoded Domain name (UTF-8 or UCS-4) |
|
| 160 |
*/ |
|
| 161 |
public function decode($input, $one_time_encoding = false) |
|
| 162 |
{
|
|
| 163 |
// Optionally set |
|
| 164 |
if ($one_time_encoding) {
|
|
| 165 |
switch ($one_time_encoding) {
|
|
| 166 |
case 'utf8': |
|
| 167 |
case 'ucs4_string': |
|
| 168 |
case 'ucs4_array': |
|
| 169 |
break; |
|
| 170 |
default: |
|
| 171 |
$this->_error('Unknown encoding '.$one_time_encoding);
|
|
| 172 |
return false; |
|
| 173 |
} |
|
| 174 |
} |
|
| 175 |
// Make sure to drop any newline characters around |
|
| 176 |
$input = trim($input); |
|
| 177 |
|
|
| 178 |
// Negotiate input and try to determine, whether it is a plain string, |
|
| 179 |
// an email address or something like a complete URL |
|
| 180 |
if (strpos($input, '@')) { // Maybe it is an email address
|
|
| 181 |
// No no in strict mode |
|
| 182 |
if ($this->_strict_mode) {
|
|
| 183 |
$this->_error('Only simple domain name parts can be handled in strict mode');
|
|
| 184 |
return false; |
|
| 185 |
} |
|
| 186 |
list ($email_pref, $input) = explode('@', $input, 2);
|
|
| 187 |
$arr = explode('.', $input);
|
|
| 188 |
foreach ($arr as $k => $v) {
|
|
| 189 |
if (preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $v)) {
|
|
| 190 |
$conv = $this->_decode($v); |
|
| 191 |
if ($conv) $arr[$k] = $conv; |
|
| 192 |
} |
|
| 193 |
} |
|
| 194 |
$input = join('.', $arr);
|
|
| 195 |
$arr = explode('.', $email_pref);
|
|
| 196 |
foreach ($arr as $k => $v) {
|
|
| 197 |
if (preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $v)) {
|
|
| 198 |
$conv = $this->_decode($v); |
|
| 199 |
if ($conv) $arr[$k] = $conv; |
|
| 200 |
} |
|
| 201 |
} |
|
| 202 |
$email_pref = join('.', $arr);
|
|
| 203 |
$return = $email_pref . '@' . $input; |
|
| 204 |
} elseif (preg_match('![:\./]!', $input)) { // Or a complete domain name (with or without paths / parameters)
|
|
| 205 |
// No no in strict mode |
|
| 206 |
if ($this->_strict_mode) {
|
|
| 207 |
$this->_error('Only simple domain name parts can be handled in strict mode');
|
|
| 208 |
return false; |
|
| 209 |
} |
|
| 210 |
$parsed = parse_url($input); |
|
| 211 |
if (isset($parsed['host'])) {
|
|
| 212 |
$arr = explode('.', $parsed['host']);
|
|
| 213 |
foreach ($arr as $k => $v) {
|
|
| 214 |
$conv = $this->_decode($v); |
|
| 215 |
if ($conv) $arr[$k] = $conv; |
|
| 216 |
} |
|
| 217 |
$parsed['host'] = join('.', $arr);
|
|
| 218 |
$return = |
|
| 219 |
(empty($parsed['scheme']) ? '' : $parsed['scheme'].(strtolower($parsed['scheme']) == 'mailto' ? ':' : '://')) |
|
| 220 |
.(empty($parsed['user']) ? '' : $parsed['user'].(empty($parsed['pass']) ? '' : ':'.$parsed['pass']).'@') |
|
| 221 |
.$parsed['host'] |
|
| 222 |
.(empty($parsed['port']) ? '' : ':'.$parsed['port']) |
|
| 223 |
.(empty($parsed['path']) ? '' : $parsed['path']) |
|
| 224 |
.(empty($parsed['query']) ? '' : '?'.$parsed['query']) |
|
| 225 |
.(empty($parsed['fragment']) ? '' : '#'.$parsed['fragment']); |
|
| 226 |
} else { // parse_url seems to have failed, try without it
|
|
| 227 |
$arr = explode('.', $input);
|
|
| 228 |
foreach ($arr as $k => $v) {
|
|
| 229 |
$conv = $this->_decode($v); |
|
| 230 |
$arr[$k] = ($conv) ? $conv : $v; |
|
| 231 |
} |
|
| 232 |
$return = join('.', $arr);
|
|
| 233 |
} |
|
| 234 |
} else { // Otherwise we consider it being a pure domain name string
|
|
| 235 |
$return = $this->_decode($input); |
|
| 236 |
if (!$return) $return = $input; |
|
| 237 |
} |
|
| 238 |
// The output is UTF-8 by default, other output formats need conversion here |
|
| 239 |
// If a one-time encoding is given, use it, else the object's property |
|
| 240 |
switch (($one_time_encoding) ? $one_time_encoding : $this->_api_encoding) {
|
|
| 241 |
case 'utf8': |
|
| 242 |
return $return; |
|
| 243 |
break; |
|
| 244 |
case 'ucs4_string': |
|
| 245 |
return $this->_ucs4_to_ucs4_string($this->_utf8_to_ucs4($return)); |
|
| 246 |
break; |
|
| 247 |
case 'ucs4_array': |
|
| 248 |
return $this->_utf8_to_ucs4($return); |
|
| 249 |
break; |
|
| 250 |
default: |
|
| 251 |
$this->_error('Unsupported output format');
|
|
| 252 |
return false; |
|
| 253 |
} |
|
| 254 |
} |
|
| 255 |
|
|
| 256 |
/** |
|
| 257 |
* Encode a given UTF-8 domain name |
|
| 258 |
* @param string Domain name (UTF-8 or UCS-4) |
|
| 259 |
* [@param string Desired input encoding, see {@link set_parameter}]
|
|
| 260 |
* @return string Encoded Domain name (ACE string) |
|
| 261 |
*/ |
|
| 262 |
public function encode($decoded, $one_time_encoding = false) |
|
| 263 |
{
|
|
| 264 |
// Forcing conversion of input to UCS4 array |
|
| 265 |
// If a one-time encoding is given, use it, else the object's property |
|
| 266 |
switch ($one_time_encoding ? $one_time_encoding : $this->_api_encoding) {
|
|
| 267 |
case 'utf8': |
|
| 268 |
$decoded = $this->_utf8_to_ucs4($decoded); |
|
| 269 |
break; |
|
| 270 |
case 'ucs4_string': |
|
| 271 |
$decoded = $this->_ucs4_string_to_ucs4($decoded); |
|
| 272 |
case 'ucs4_array': |
|
| 273 |
break; |
|
| 274 |
default: |
|
| 275 |
$this->_error('Unsupported input format: '.($one_time_encoding ? $one_time_encoding : $this->_api_encoding));
|
|
| 276 |
return false; |
|
| 277 |
} |
|
| 278 |
|
|
| 279 |
// No input, no output, what else did you expect? |
|
| 280 |
if (empty($decoded)) return ''; |
|
| 281 |
|
|
| 282 |
// Anchors for iteration |
|
| 283 |
$last_begin = 0; |
|
| 284 |
// Output string |
|
| 285 |
$output = ''; |
|
| 286 |
foreach ($decoded as $k => $v) {
|
|
| 287 |
// Make sure to use just the plain dot |
|
| 288 |
switch($v) {
|
|
| 289 |
case 0x3002: |
|
| 290 |
case 0xFF0E: |
|
| 291 |
case 0xFF61: |
|
| 292 |
$decoded[$k] = 0x2E; |
|
| 293 |
// Right, no break here, the above are converted to dots anyway |
|
| 294 |
// Stumbling across an anchoring character |
|
| 295 |
case 0x2E: |
|
| 296 |
case 0x2F: |
|
| 297 |
case 0x3A: |
|
| 298 |
case 0x3F: |
|
| 299 |
case 0x40: |
|
| 300 |
// Neither email addresses nor URLs allowed in strict mode |
|
| 301 |
if ($this->_strict_mode) {
|
|
| 302 |
$this->_error('Neither email addresses nor URLs are allowed in strict mode.');
|
|
| 303 |
return false; |
|
| 304 |
} else {
|
|
| 305 |
// Skip first char |
|
| 306 |
if ($k) {
|
|
| 307 |
$encoded = ''; |
|
| 308 |
$encoded = $this->_encode(array_slice($decoded, $last_begin, (($k)-$last_begin))); |
|
| 309 |
if ($encoded) {
|
|
| 310 |
$output .= $encoded; |
|
| 311 |
} else {
|
|
| 312 |
$output .= $this->_ucs4_to_utf8(array_slice($decoded, $last_begin, (($k)-$last_begin))); |
|
| 313 |
} |
|
| 314 |
$output .= chr($decoded[$k]); |
|
| 315 |
} |
|
| 316 |
$last_begin = $k + 1; |
|
| 317 |
} |
|
| 318 |
} |
|
| 319 |
} |
|
| 320 |
// Catch the rest of the string |
|
| 321 |
if ($last_begin) {
|
|
| 322 |
$inp_len = sizeof($decoded); |
|
| 323 |
$encoded = ''; |
|
| 324 |
$encoded = $this->_encode(array_slice($decoded, $last_begin, (($inp_len)-$last_begin))); |
|
| 325 |
if ($encoded) {
|
|
| 326 |
$output .= $encoded; |
|
| 327 |
} else {
|
|
| 328 |
$output .= $this->_ucs4_to_utf8(array_slice($decoded, $last_begin, (($inp_len)-$last_begin))); |
|
| 329 |
} |
|
| 330 |
return $output; |
|
| 331 |
} else {
|
|
| 332 |
if ($output = $this->_encode($decoded)) {
|
|
| 333 |
return $output; |
|
| 334 |
} else {
|
|
| 335 |
return $this->_ucs4_to_utf8($decoded); |
|
| 336 |
} |
|
| 337 |
} |
|
| 338 |
} |
|
| 339 |
|
|
| 340 |
/** |
|
| 341 |
* Removes a weakness of encode(), which cannot properly handle URIs but instead encodes their |
|
| 342 |
* path or query components, too. |
|
| 343 |
* @param string $uri Expects the URI as a UTF-8 (or ASCII) string |
|
| 344 |
* @return string The URI encoded to Punycode, everything but the host component is left alone |
|
| 345 |
* @since 0.6.4 |
|
| 346 |
*/ |
|
| 347 |
public function encode_uri($uri) |
|
| 348 |
{
|
|
| 349 |
$parsed = parse_url($uri); |
|
| 350 |
if (!isset($parsed['host'])) {
|
|
| 351 |
$this->_error('The given string does not look like a URI');
|
|
| 352 |
return false; |
|
| 353 |
} |
|
| 354 |
$arr = explode('.', $parsed['host']);
|
|
| 355 |
foreach ($arr as $k => $v) {
|
|
| 356 |
$conv = $this->encode($v, 'utf8'); |
|
| 357 |
if ($conv) $arr[$k] = $conv; |
|
| 358 |
} |
|
| 359 |
$parsed['host'] = join('.', $arr);
|
|
| 360 |
$return = |
|
| 361 |
(empty($parsed['scheme']) ? '' : $parsed['scheme'].(strtolower($parsed['scheme']) == 'mailto' ? ':' : '://')) |
|
| 362 |
.(empty($parsed['user']) ? '' : $parsed['user'].(empty($parsed['pass']) ? '' : ':'.$parsed['pass']).'@') |
|
| 363 |
.$parsed['host'] |
|
| 364 |
.(empty($parsed['port']) ? '' : ':'.$parsed['port']) |
|
| 365 |
.(empty($parsed['path']) ? '' : $parsed['path']) |
|
| 366 |
.(empty($parsed['query']) ? '' : '?'.$parsed['query']) |
|
| 367 |
.(empty($parsed['fragment']) ? '' : '#'.$parsed['fragment']); |
|
| 368 |
return $return; |
|
| 369 |
} |
|
| 370 |
|
|
| 371 |
/** |
|
| 372 |
* Use this method to get the last error that occurred |
|
| 373 |
* @param void |
|
| 374 |
* @return string The last error that occurred |
|
| 375 |
*/ |
|
| 376 |
public function get_last_error() |
|
| 377 |
{
|
|
| 378 |
return $this->_error; |
|
| 379 |
} |
|
| 380 |
|
|
| 381 |
/** |
|
| 382 |
* The actual decoding algorithm |
|
| 383 |
* @param string |
|
| 384 |
* @return mixed |
|
| 385 |
*/ |
|
| 386 |
protected function _decode($encoded) |
|
| 387 |
{
|
|
| 388 |
$decoded = array(); |
|
| 389 |
// find the Punycode prefix |
|
| 390 |
if (!preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $encoded)) {
|
|
| 391 |
$this->_error('This is not a punycode string');
|
|
| 392 |
return false; |
|
| 393 |
} |
|
| 394 |
$encode_test = preg_replace('!^'.preg_quote($this->_punycode_prefix, '!').'!', '', $encoded);
|
|
| 395 |
// If nothing left after removing the prefix, it is hopeless |
|
| 396 |
if (!$encode_test) {
|
|
| 397 |
$this->_error('The given encoded string was empty');
|
|
| 398 |
return false; |
|
| 399 |
} |
|
| 400 |
// Find last occurrence of the delimiter |
|
| 401 |
$delim_pos = strrpos($encoded, '-'); |
|
| 402 |
if ($delim_pos > strlen($this->_punycode_prefix)) {
|
|
| 403 |
for ($k = strlen($this->_punycode_prefix); $k < $delim_pos; ++$k) {
|
|
| 404 |
$decoded[] = ord($encoded[$k]);
|
|
| 405 |
} |
|
| 406 |
} |
|
| 407 |
$deco_len = count($decoded); |
|
| 408 |
$enco_len = strlen($encoded); |
|
| 409 |
|
|
| 410 |
// Wandering through the strings; init |
|
| 411 |
$is_first = true; |
|
| 412 |
$bias = $this->_initial_bias; |
|
| 413 |
$idx = 0; |
|
| 414 |
$char = $this->_initial_n; |
|
| 415 |
|
|
| 416 |
for ($enco_idx = ($delim_pos) ? ($delim_pos + 1) : 0; $enco_idx < $enco_len; ++$deco_len) {
|
|
| 417 |
for ($old_idx = $idx, $w = 1, $k = $this->_base; 1 ; $k += $this->_base) {
|
|
| 418 |
$digit = $this->_decode_digit($encoded[$enco_idx++]);
|
|
| 419 |
$idx += $digit * $w; |
|
| 420 |
$t = ($k <= $bias) ? $this->_tmin : |
|
| 421 |
(($k >= $bias + $this->_tmax) ? $this->_tmax : ($k - $bias)); |
|
| 422 |
if ($digit < $t) break; |
|
| 423 |
$w = (int) ($w * ($this->_base - $t)); |
|
| 424 |
} |
|
| 425 |
$bias = $this->_adapt($idx - $old_idx, $deco_len + 1, $is_first); |
|
| 426 |
$is_first = false; |
|
| 427 |
$char += (int) ($idx / ($deco_len + 1)); |
|
| 428 |
$idx %= ($deco_len + 1); |
|
| 429 |
if ($deco_len > 0) {
|
|
| 430 |
// Make room for the decoded char |
|
| 431 |
for ($i = $deco_len; $i > $idx; $i--) $decoded[$i] = $decoded[($i - 1)]; |
|
| 432 |
} |
|
| 433 |
$decoded[$idx++] = $char; |
|
| 434 |
} |
|
| 435 |
return $this->_ucs4_to_utf8($decoded); |
|
| 436 |
} |
|
| 437 |
|
|
| 438 |
/** |
|
| 439 |
* The actual encoding algorithm |
|
| 440 |
* @param array UCS-4 code points |
|
| 441 |
* @return mixed |
|
| 442 |
*/ |
|
| 443 |
protected function _encode($decoded) |
|
| 444 |
{
|
|
| 445 |
// We cannot encode a domain name containing the Punycode prefix |
|
| 446 |
$extract = strlen($this->_punycode_prefix); |
|
| 447 |
$check_pref = $this->_utf8_to_ucs4($this->_punycode_prefix); |
|
| 448 |
$check_deco = array_slice($decoded, 0, $extract); |
|
| 449 |
|
|
| 450 |
if ($check_pref == $check_deco) {
|
|
| 451 |
$this->_error('This is already a punycode string');
|
|
| 452 |
return false; |
|
| 453 |
} |
|
| 454 |
// We will not try to encode strings consisting of basic code points only |
|
| 455 |
$encodable = false; |
|
| 456 |
foreach ($decoded as $k => $v) {
|
|
| 457 |
if ($v > 0x7a) {
|
|
| 458 |
$encodable = true; |
|
| 459 |
break; |
|
| 460 |
} |
|
| 461 |
} |
|
| 462 |
if (!$encodable) {
|
|
| 463 |
$this->_error('The given string does not contain encodable chars');
|
|
| 464 |
return false; |
|
| 465 |
} |
|
| 466 |
// Do NAMEPREP |
|
| 467 |
$decoded = $this->_nameprep($decoded); |
|
| 468 |
if (!$decoded || !is_array($decoded)) return false; // NAMEPREP failed |
|
| 469 |
$deco_len = count($decoded); |
|
| 470 |
if (!$deco_len) return false; // Empty array |
|
| 471 |
$codecount = 0; // How many chars have been consumed |
|
| 472 |
$encoded = ''; |
|
| 473 |
// Copy all basic code points to output |
|
| 474 |
for ($i = 0; $i < $deco_len; ++$i) {
|
|
| 475 |
$test = $decoded[$i]; |
|
| 476 |
// Will match [-0-9a-zA-Z] |
|
| 477 |
if ((0x2F < $test && $test < 0x40) || (0x40 < $test && $test < 0x5B) |
|
| 478 |
|| (0x60 < $test && $test <= 0x7B) || (0x2D == $test)) {
|
|
| 479 |
$encoded .= chr($decoded[$i]); |
|
| 480 |
$codecount++; |
|
| 481 |
} |
|
| 482 |
} |
|
| 483 |
if ($codecount == $deco_len) return $encoded; // All codepoints were basic ones |
|
| 484 |
|
|
| 485 |
// Start with the prefix; copy it to output |
|
| 486 |
$encoded = $this->_punycode_prefix.$encoded; |
|
| 487 |
// If we have basic code points in output, add a hyphen to the end |
|
| 488 |
if ($codecount) $encoded .= '-'; |
|
| 489 |
// Now find and encode all non-basic code points |
|
| 490 |
$is_first = true; |
|
| 491 |
$cur_code = $this->_initial_n; |
|
| 492 |
$bias = $this->_initial_bias; |
|
| 493 |
$delta = 0; |
|
| 494 |
while ($codecount < $deco_len) {
|
|
| 495 |
// Find the smallest code point >= the current code point and |
|
| 496 |
// remember the last occurrence of it in the input |
|
| 497 |
for ($i = 0, $next_code = $this->_max_ucs; $i < $deco_len; $i++) {
|
|
| 498 |
if ($decoded[$i] >= $cur_code && $decoded[$i] <= $next_code) {
|
|
| 499 |
$next_code = $decoded[$i]; |
|
| 500 |
} |
|
| 501 |
} |
|
| 502 |
$delta += ($next_code - $cur_code) * ($codecount + 1); |
|
| 503 |
$cur_code = $next_code; |
|
| 504 |
|
|
| 505 |
// Scan input again and encode all characters whose code point is $cur_code |
|
| 506 |
for ($i = 0; $i < $deco_len; $i++) {
|
|
| 507 |
if ($decoded[$i] < $cur_code) {
|
|
| 508 |
$delta++; |
|
| 509 |
} elseif ($decoded[$i] == $cur_code) {
|
|
| 510 |
for ($q = $delta, $k = $this->_base; 1; $k += $this->_base) {
|
|
| 511 |
$t = ($k <= $bias) ? $this->_tmin : |
|
| 512 |
(($k >= $bias + $this->_tmax) ? $this->_tmax : $k - $bias); |
|
| 513 |
if ($q < $t) break; |
|
| 514 |
$encoded .= $this->_encode_digit(intval($t + (($q - $t) % ($this->_base - $t)))); //v0.4.5 Changed from ceil() to intval() |
|
| 515 |
$q = (int) (($q - $t) / ($this->_base - $t)); |
|
| 516 |
} |
|
| 517 |
$encoded .= $this->_encode_digit($q); |
|
| 518 |
$bias = $this->_adapt($delta, $codecount+1, $is_first); |
|
| 519 |
$codecount++; |
|
| 520 |
$delta = 0; |
|
| 521 |
$is_first = false; |
|
| 522 |
} |
|
| 523 |
} |
|
| 524 |
$delta++; |
|
| 525 |
$cur_code++; |
|
| 526 |
} |
|
| 527 |
return $encoded; |
|
| 528 |
} |
|
| 529 |
|
|
| 530 |
/** |
|
| 531 |
* Adapt the bias according to the current code point and position |
|
| 532 |
* @param int $delta |
|
| 533 |
* @param int $npoints |
|
| 534 |
* @param bool $is_first |
|
| 535 |
* @return int |
|
| 536 |
*/ |
|
| 537 |
protected function _adapt($delta, $npoints, $is_first) |
|
| 538 |
{
|
|
| 539 |
$delta = intval($is_first ? ($delta / $this->_damp) : ($delta / 2)); |
|
| 540 |
$delta += intval($delta / $npoints); |
|
| 541 |
for ($k = 0; $delta > (($this->_base - $this->_tmin) * $this->_tmax) / 2; $k += $this->_base) {
|
|
| 542 |
$delta = intval($delta / ($this->_base - $this->_tmin)); |
|
| 543 |
} |
|
| 544 |
return intval($k + ($this->_base - $this->_tmin + 1) * $delta / ($delta + $this->_skew)); |
|
| 545 |
} |
|
| 546 |
|
|
| 547 |
/** |
|
| 548 |
* Encoding a certain digit |
|
| 549 |
* @param int $d |
|
| 550 |
* @return string |
|
| 551 |
*/ |
|
| 552 |
protected function _encode_digit($d) |
|
| 553 |
{
|
|
| 554 |
return chr($d + 22 + 75 * ($d < 26)); |
|
| 555 |
} |
|
| 556 |
|
|
| 557 |
/** |
|
| 558 |
* Decode a certain digit |
|
| 559 |
* @param int $cp |
|
| 560 |
* @return int |
|
| 561 |
*/ |
|
| 562 |
protected function _decode_digit($cp) |
|
| 563 |
{
|
|
| 564 |
$cp = ord($cp); |
|
| 565 |
return ($cp - 48 < 10) ? $cp - 22 : (($cp - 65 < 26) ? $cp - 65 : (($cp - 97 < 26) ? $cp - 97 : $this->_base)); |
|
| 566 |
} |
|
| 567 |
|
|
| 568 |
/** |
|
| 569 |
* Internal error handling method |
|
| 570 |
* @param string $error |
|
| 571 |
*/ |
|
| 572 |
protected function _error($error = '') |
|
| 573 |
{
|
|
| 574 |
$this->_error = $error; |
|
| 575 |
} |
|
| 576 |
|
|
| 577 |
/** |
|
| 578 |
* Do Nameprep according to RFC3491 and RFC3454 |
|
| 579 |
* @param array Unicode Characters |
|
| 580 |
* @return array|false Unicode Characters, Nameprep'd (false on prohibited input) |
|
| 581 |
*/ |
|
| 582 |
protected function _nameprep($input) |
|
| 583 |
{
|
|
| 584 |
$output = array(); |
|
| 585 |
$error = false; |
|
| 586 |
// |
|
| 587 |
// Mapping |
|
| 588 |
// Walking through the input array, performing the required steps on each of |
|
| 589 |
// the input chars and putting the result into the output array |
|
| 590 |
// While mapping required chars we apply the canonical ordering |
|
| 591 |
foreach ($input as $v) {
|
|
| 592 |
// Map to nothing == skip that code point |
|
| 593 |
if (in_array($v, $this->NP['map_nothing'])) continue; |
|
| 594 |
// Try to find prohibited input |
|
| 595 |
if (in_array($v, $this->NP['prohibit']) || in_array($v, $this->NP['general_prohibited'])) {
|
|
| 596 |
$this->_error('NAMEPREP: Prohibited input U+'.sprintf('%08X', $v));
|
|
| 597 |
return false; |
|
| 598 |
} |
|
| 599 |
foreach ($this->NP['prohibit_ranges'] as $range) {
|
|
| 600 |
if ($range[0] <= $v && $v <= $range[1]) {
|
|
| 601 |
$this->_error('NAMEPREP: Prohibited input U+'.sprintf('%08X', $v));
|
|
| 602 |
return false; |
|
| 603 |
} |
|
| 604 |
} |
|
| 605 |
// Hangul syllable decomposition |
|
| 606 |
if (0xAC00 <= $v && $v <= 0xD7AF) {
|
|
| 607 |
foreach ($this->_hangul_decompose($v) as $out) $output[] = (int) $out; |
|
| 608 |
// There's a decomposition mapping for that code point |
|
| 609 |
} elseif (isset($this->NP['replacemaps'][$v])) {
|
|
| 610 |
foreach ($this->_apply_cannonical_ordering($this->NP['replacemaps'][$v]) as $out) {
|
|
| 611 |
$output[] = (int) $out; |
|
| 612 |
} |
|
| 613 |
} else {
|
|
| 614 |
$output[] = (int) $v; |
|
| 615 |
} |
|
| 616 |
} |
|
| 617 |
// Before applying any Combining, try to rearrange any Hangul syllables |
|
| 618 |
$output = $this->_hangul_compose($output); |
|
| 619 |
// |
|
| 620 |
// Combine code points |
|
| 621 |
// |
|
| 622 |
$last_class = 0; |
|
| 623 |
$last_starter = 0; |
|
| 624 |
$out_len = count($output); |
|
| 625 |
for ($i = 0; $i < $out_len; ++$i) {
|
|
| 626 |
$class = $this->_get_combining_class($output[$i]); |
|
| 627 |
if ((!$last_class || $last_class > $class) && $class) {
|
|
| 628 |
// Try to match |
|
| 629 |
$seq_len = $i - $last_starter; |
|
| 630 |
$out = $this->_combine(array_slice($output, $last_starter, $seq_len)); |
|
| 631 |
// On match: Replace the last starter with the composed character and remove |
|
| 632 |
// the now redundant non-starter(s) |
|
| 633 |
if ($out) {
|
|
| 634 |
$output[$last_starter] = $out; |
|
| 635 |
if (count($out) != $seq_len) {
|
|
| 636 |
for ($j = $i+1; $j < $out_len; ++$j) $output[$j-1] = $output[$j]; |
|
| 637 |
unset($output[$out_len]); |
|
| 638 |
} |
|
| 639 |
// Rewind the for loop by one, since there can be more possible compositions |
|
| 640 |
$i--; |
|
| 641 |
$out_len--; |
|
| 642 |
$last_class = ($i == $last_starter) ? 0 : $this->_get_combining_class($output[$i-1]); |
|
| 643 |
continue; |
|
| 644 |
} |
|
| 645 |
} |
|
| 646 |
// The current class is 0 |
|
| 647 |
if (!$class) $last_starter = $i; |
|
| 648 |
$last_class = $class; |
|
| 649 |
} |
|
| 650 |
return $output; |
|
| 651 |
} |
|
| 652 |
|
|
| 653 |
/** |
|
| 654 |
* Decomposes a Hangul syllable |
|
| 655 |
* (see http://www.unicode.org/unicode/reports/tr15/#Hangul) |
|
| 656 |
* @param integer 32bit UCS4 code point |
|
| 657 |
* @return array Either Hangul Syllable decomposed or original 32bit value as one value array |
|
| 658 |
*/ |
|
| 659 |
protected function _hangul_decompose($char) |
|
| 660 |
{
|
|
| 661 |
$sindex = (int) $char - $this->_sbase; |
|
| 662 |
if ($sindex < 0 || $sindex >= $this->_scount) return array($char); |
|
| 663 |
$result = array(); |
|
| 664 |
$result[] = (int) $this->_lbase + $sindex / $this->_ncount; |
|
| 665 |
$result[] = (int) $this->_vbase + ($sindex % $this->_ncount) / $this->_tcount; |
|
| 666 |
$T = intval($this->_tbase + $sindex % $this->_tcount); |
|
| 667 |
if ($T != $this->_tbase) $result[] = $T; |
|
| 668 |
return $result; |
|
| 669 |
} |
|
| 670 |
/** |
|
| 671 |
* Composes a Hangul syllable |
|
| 672 |
* (see http://www.unicode.org/unicode/reports/tr15/#Hangul) |
|
| 673 |
* @param array Decomposed UCS4 sequence |
|
| 674 |
* @return array UCS4 sequence with syllables composed |
|
| 675 |
*/ |
|
| 676 |
protected function _hangul_compose($input) |
|
| 677 |
{
|
|
| 678 |
$inp_len = count($input); |
|
| 679 |
if (!$inp_len) return array(); |
|
| 680 |
$result = array(); |
|
| 681 |
$last = (int) $input[0]; |
|
| 682 |
$result[] = $last; // copy first char from input to output |
|
| 683 |
|
|
| 684 |
for ($i = 1; $i < $inp_len; ++$i) {
|
|
| 685 |
$char = (int) $input[$i]; |
|
| 686 |
$sindex = $last - $this->_sbase; |
|
| 687 |
$lindex = $last - $this->_lbase; |
|
| 688 |
$vindex = $char - $this->_vbase; |
|
| 689 |
$tindex = $char - $this->_tbase; |
|
| 690 |
// Find out, whether two current characters are LV and T |
|
| 691 |
if (0 <= $sindex && $sindex < $this->_scount && ($sindex % $this->_tcount == 0) |
|
| 692 |
&& 0 <= $tindex && $tindex <= $this->_tcount) {
|
|
| 693 |
// create syllable of form LVT |
|
| 694 |
$last += $tindex; |
|
| 695 |
$result[(count($result) - 1)] = $last; // reset last |
|
| 696 |
continue; // discard char |
|
| 697 |
} |
|
| 698 |
// Find out, whether two current characters form L and V |
|
| 699 |
if (0 <= $lindex && $lindex < $this->_lcount && 0 <= $vindex && $vindex < $this->_vcount) {
|
|
| 700 |
// create syllable of form LV |
|
| 701 |
$last = (int) $this->_sbase + ($lindex * $this->_vcount + $vindex) * $this->_tcount; |
|
| 702 |
$result[(count($result) - 1)] = $last; // reset last |
|
| 703 |
continue; // discard char |
|
| 704 |
} |
|
| 705 |
// if neither case was true, just add the character |
|
| 706 |
$last = $char; |
|
| 707 |
$result[] = $char; |
|
| 708 |
} |
|
| 709 |
return $result; |
|
| 710 |
} |
|
| 711 |
|
|
| 712 |
/** |
|
| 713 |
* Returns the combining class of a certain wide char |
|
| 714 |
* @param integer Wide char to check (32bit integer) |
|
| 715 |
* @return integer Combining class if found, else 0 |
|
| 716 |
*/ |
|
| 717 |
protected function _get_combining_class($char) |
|
| 718 |
{
|
|
| 719 |
return isset($this->NP['norm_combcls'][$char]) ? $this->NP['norm_combcls'][$char] : 0; |
|
| 720 |
} |
|
| 721 |
|
|
| 722 |
/** |
|
| 723 |
* Applies the canonical ordering of a decomposed UCS4 sequence |
|
| 724 |
* @param array Decomposed UCS4 sequence |
|
| 725 |
* @return array Ordered UCS4 sequence |
|
| 726 |
*/ |
|
| 727 |
protected function _apply_cannonical_ordering($input) |
|
| 728 |
{
|
|
| 729 |
$swap = true; |
|
| 730 |
$size = count($input); |
|
| 731 |
while ($swap) {
|
|
| 732 |
$swap = false; |
|
| 733 |
$last = $this->_get_combining_class(intval($input[0])); |
|
| 734 |
for ($i = 0; $i < $size-1; ++$i) {
|
|
| 735 |
$next = $this->_get_combining_class(intval($input[$i+1])); |
|
| 736 |
if ($next != 0 && $last > $next) {
|
|
| 737 |
// Move item leftward until it fits |
|
| 738 |
for ($j = $i + 1; $j > 0; --$j) {
|
|
| 739 |
if ($this->_get_combining_class(intval($input[$j-1])) <= $next) break; |
|
| 740 |
$t = intval($input[$j]); |
|
| 741 |
$input[$j] = intval($input[$j-1]); |
|
| 742 |
$input[$j-1] = $t; |
|
| 743 |
$swap = true; |
|
| 744 |
} |
|
| 745 |
// Reentering the loop looking at the old character again |
|
| 746 |
$next = $last; |
|
| 747 |
} |
|
| 748 |
$last = $next; |
|
| 749 |
} |
|
| 750 |
} |
|
| 751 |
return $input; |
|
| 752 |
} |
|
| 753 |
|
|
| 754 |
/** |
|
| 755 |
* Do composition of a sequence of starter and non-starter |
|
| 756 |
* @param array UCS4 Decomposed sequence |
|
| 757 |
* @return int|false Composed code point on a match, otherwise false |
|
| 758 |
*/ |
|
| 759 |
protected function _combine($input) |
|
| 760 |
{
|
|
| 761 |
$inp_len = count($input); |
|
| 762 |
foreach ($this->NP['replacemaps'] as $np_src => $np_target) {
|
|
| 763 |
if ($np_target[0] != $input[0]) continue; |
|
| 764 |
if (count($np_target) != $inp_len) continue; |
|
| 765 |
$hit = false; |
|
| 766 |
foreach ($input as $k2 => $v2) {
|
|
| 767 |
if ($v2 == $np_target[$k2]) {
|
|
| 768 |
$hit = true; |
|
| 769 |
} else {
|
|
| 770 |
$hit = false; |
|
| 771 |
break; |
|
| 772 |
} |
|
| 773 |
} |
|
| 774 |
if ($hit) return $np_src; |
|
| 775 |
} |
|
| 776 |
return false; |
|
| 777 |
} |
|
| 778 |
|
|
| 779 |
/** |
|
| 780 |
* This converts a UTF-8 encoded string to its UCS-4 representation |
|
| 781 |
* By talking about UCS-4 "strings" we mean arrays of 32bit integers representing |
|
| 782 |
* each of the "chars". This is due to PHP not being able to handle strings with |
|
| 783 |
* bit depth different from 8. This applies to the reverse method _ucs4_to_utf8(), too. |
|
| 784 |
* The following UTF-8 encodings are supported: |
|
| 785 |
* bytes bits representation |
|
| 786 |
* 1 7 0xxxxxxx |
|
| 787 |
* 2 11 110xxxxx 10xxxxxx |
|
| 788 |
* 3 16 1110xxxx 10xxxxxx 10xxxxxx |
|
| 789 |
* 4 21 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
| 790 |
* 5 26 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
| 791 |
* 6 31 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
| 792 |
* Each x represents a bit that can be used to store character data. |
|
| 793 |
* The five and six byte sequences are part of Annex D of ISO/IEC 10646-1:2000 |
|
| 794 |
* @param string $input |
|
| 795 |
* @return array|false |
|
| 796 |
*/ |
|
| 797 |
protected function _utf8_to_ucs4($input) |
|
| 798 |
{
|
|
| 799 |
$output = array(); |
|
| 800 |
$out_len = 0; |
|
| 801 |
// Patch by Daniel Hahler; work around problem with mbstring.func_overload |
|
| 802 |
if (function_exists('mb_strlen')) {
|
|
| 803 |
$inp_len = mb_strlen($input, '8bit'); |
|
| 804 |
} else {
|
|
| 805 |
$inp_len = strlen($input); |
|
| 806 |
} |
|
| 807 |
$mode = 'next'; |
|
| 808 |
$test = 'none'; |
|
| 809 |
for ($k = 0; $k < $inp_len; ++$k) {
|
|
| 810 |
$v = ord($input[$k]); // Extract byte from input string
|
|
| 811 |
if ($v < 128) { // We found an ASCII char - put into string as is
|
|
| 812 |
$output[$out_len] = $v; |
|
| 813 |
++$out_len; |
|
| 814 |
if ('add' == $mode) {
|
|
| 815 |
$this->_error('Conversion from UTF-8 to UCS-4 failed: malformed input at byte '.$k);
|
|
| 816 |
return false; |
|
| 817 |
} |
|
| 818 |
continue; |
|
| 819 |
} |
|
| 820 |
if ('next' == $mode) { // Try to find the next start byte; determine the width of the Unicode char
|
|
| 821 |
$start_byte = $v; |
|
| 822 |
$mode = 'add'; |
|
| 823 |
$test = 'range'; |
|
| 824 |
if ($v >> 5 == 6) { // &110xxxxx 10xxxxxx
|
|
| 825 |
$next_byte = 0; // Tells, how many times subsequent bitmasks must rotate 6bits to the left |
|
| 826 |
$v = ($v - 192) << 6; |
|
| 827 |
} elseif ($v >> 4 == 14) { // &1110xxxx 10xxxxxx 10xxxxxx
|
|
| 828 |
$next_byte = 1; |
|
| 829 |
$v = ($v - 224) << 12; |
|
| 830 |
} elseif ($v >> 3 == 30) { // &11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
|
|
| 831 |
$next_byte = 2; |
|
| 832 |
$v = ($v - 240) << 18; |
|
| 833 |
} elseif ($v >> 2 == 62) { // &111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
|
|
| 834 |
$next_byte = 3; |
|
| 835 |
$v = ($v - 248) << 24; |
|
| 836 |
} elseif ($v >> 1 == 126) { // &1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
|
|
| 837 |
$next_byte = 4; |
|
| 838 |
$v = ($v - 252) << 30; |
|
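
The index arithmetic behind `_hangul_decompose()` and `_hangul_compose()` is easier to follow with concrete numbers. The sketch below is a standalone illustration under stated assumptions, not the class's own code. The function names `hangul_decompose` and `hangul_compose_jamo` are hypothetical, and the constants are the values from the Hangul algorithm in the Unicode Standard (referenced by UAX #15), which are assumed to match `_sbase`, `_lbase`, `_vbase`, `_tbase` and the `_*count` properties that are not shown in this part of the diff. It assumes PHP 7 or later for `intdiv()`.

```php
<?php
// Constants of the Unicode Hangul syllable algorithm (assumed to match the class properties).
const HANGUL_S_BASE  = 0xAC00; // first precomposed syllable
const HANGUL_L_BASE  = 0x1100; // first leading consonant
const HANGUL_V_BASE  = 0x1161; // first vowel
const HANGUL_T_BASE  = 0x11A7; // one before the first trailing consonant
const HANGUL_V_COUNT = 21;
const HANGUL_T_COUNT = 28;
const HANGUL_N_COUNT = 588;    // V_COUNT * T_COUNT
const HANGUL_S_COUNT = 11172;  // 19 leading consonants * N_COUNT

// Hypothetical helper: split a precomposed syllable into L, V and optional T jamo.
function hangul_decompose(int $cp): array
{
    $sIndex = $cp - HANGUL_S_BASE;
    if ($sIndex < 0 || $sIndex >= HANGUL_S_COUNT) {
        return [$cp]; // not a Hangul syllable: return it unchanged
    }
    $result = [
        HANGUL_L_BASE + intdiv($sIndex, HANGUL_N_COUNT),
        HANGUL_V_BASE + intdiv($sIndex % HANGUL_N_COUNT, HANGUL_T_COUNT),
    ];
    $t = HANGUL_T_BASE + $sIndex % HANGUL_T_COUNT;
    if ($t !== HANGUL_T_BASE) {
        $result[] = $t; // only syllables of form LVT carry a trailing consonant
    }
    return $result;
}

// Hypothetical helper: inverse of the above. Assumes $jamo is a valid [L, V] or [L, V, T] list.
function hangul_compose_jamo(array $jamo): int
{
    $lIndex = $jamo[0] - HANGUL_L_BASE;
    $vIndex = $jamo[1] - HANGUL_V_BASE;
    $tIndex = isset($jamo[2]) ? $jamo[2] - HANGUL_T_BASE : 0;
    return HANGUL_S_BASE + ($lIndex * HANGUL_V_COUNT + $vIndex) * HANGUL_T_COUNT + $tIndex;
}

// U+D55C (한) has syllable index 10588 = 18 * 588 + 0 * 28 + 4,
// so it splits into U+1112, U+1161 and U+11AB.
$parts = hangul_decompose(0xD55C);
```

Using `intdiv()` keeps every intermediate value an integer. In PHP, a cast such as `(int) $a + $b / $c` applies only to `$a`, so an expression written that way can yield a float that has to be cast again later.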