Revision 1378
Added by Luisehahne almost 14 years ago
branches/2.8.x/CHANGELOG | ||
---|---|---|
11 | 11 |
! = Update/Change |
12 | 12 |
|
13 | 13 |
------------------------------------- 2.8.2 ------------------------------------- |
14 |
13 Jan-2011 Build 1378 Werner von den Decken (DarkViper) |
|
15 |
! fixed inclusion of SecureForm |
|
16 |
+ added IDNA/Punycode to wb::validate_email() |
|
14 | 17 |
11 Jan-2011 Build 1377 Frank Heyne (FrankH) |
15 | 18 |
# Security fix for modules jsadmin, menu_link and output_filter |
16 | 19 |
11 Jan-2011 Build 1376 Frank Heyne (FrankH) |
branches/2.8.x/wb/include/idna_convert/example.php | ||
---|---|---|
1 |
<?php |
|
2 |
$encoded = $decoded = $add = ''; |
|
3 |
header('Content-Type: text/html; charset=utf-8'); |
|
4 |
require_once('idna_convert.class.php'); |
|
5 |
$IDN = new idna_convert(); |
|
6 |
if (isset($_REQUEST['encode'])) { |
|
7 |
$decoded = isset($_REQUEST['decoded']) ? stripslashes($_REQUEST['decoded']) : ''; |
|
8 |
$encoded = $IDN->encode($decoded); |
|
9 |
} |
|
10 |
if (isset($_REQUEST['decode'])) { |
|
11 |
$encoded = isset($_REQUEST['encoded']) ? stripslashes($_REQUEST['encoded']) : ''; |
|
12 |
$decoded = $IDN->decode($encoded); |
|
13 |
} |
|
14 |
$lang = 'en'; |
|
15 |
if (isset($_REQUEST['lang'])) { |
|
16 |
if ('de' == $_REQUEST['lang'] || 'en' == $_REQUEST['lang']) $lang = $_REQUEST['lang']; |
|
17 |
$add .= '<input type="hidden" name="lang" value="'.$lang.'" />'."\n"; |
|
18 |
} |
|
19 |
?> |
|
20 |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
|
21 |
<html xmlns="http://www.w3.org/1999/xhtml"> |
|
22 |
<head> |
|
23 |
<title>phlyLabs Punycode Converter</title> |
|
24 |
<meta name="author" content="phlyLabs" /> |
|
25 |
<meta http-equiv="content-type" content="text/html; charset=utf-8" /> |
|
26 |
<style type="text/css"> |
|
27 |
/*<![CDATA[*/ |
|
28 |
body { color:black;background:white;font-size:10pt;font-family:Verdana,Helvetica,Sans-Serif; } |
|
29 |
body, form { margin:0; } |
|
30 |
form { display:inline; } |
|
31 |
input { font-size:8pt;font-family:Verdana,Helvetica,Sans-Serif; } |
|
32 |
#round { width:730px;padding:10px;background-color:rgb(230,230,240);border:1px solid black;text-align:center;vertical-align:middle;margin:auto;margin-top:50px; } |
|
33 |
th { font-size:9pt;font-weight:bold; } |
|
34 |
#copy { font-size:8pt;color:rgb(60,60,80); } |
|
35 |
#subhead { font-size:8pt; } |
|
36 |
#bla { font-size:8pt;text-align:left; } |
|
37 |
h5 {margin:0;font-size:11pt;font-weight:bold;} |
|
38 |
/*]]>*/ |
|
39 |
</style> |
|
40 |
</head> |
|
41 |
<body> |
|
42 |
<div id="round"> |
|
43 |
<h5>phlyLabs' pure PHP IDNA Converter</h5><br /> |
|
44 |
<span id="subhead"> |
|
45 |
See <a href="http://faqs.org/rfcs/rfc3490.html" title="IDNA" target="_blank">RFC3490</a>, |
|
46 |
<a href="http://faqs.org/rfcs/rfc3491.html" title="Nameprep, a Stringprep profile" target="_blank">RFC3491</a>, |
|
47 |
<a href="http://faqs.org/rfcs/rfc3492.html" title="Punycode" target="_blank">RFC3492</a> and |
|
48 |
<a href="http://faqs.org/rfcs/rfc3454.html" title="Stringprep" target="_blank">RFC3454</a><br /> |
|
49 |
</span> |
|
50 |
<br /> |
|
51 |
<div id="bla"><?php if ($lang == 'de') { ?> |
|
52 |
Dieser Konverter erlaubt die Übersetzung von Domainnamen zwischen der Punycode- und der |
|
53 |
Unicode-Schreibweise.<br /> |
|
54 |
Geben Sie einfach den Domainnamen im entsprechend bezeichneten Feld ein und klicken Sie dann auf den darunter |
|
55 |
liegenden Button. Sie können einfache Domainnamen, komplette URLs (wie http://jürgen-müller.de) |
|
56 |
oder Emailadressen eingeben.<br /> |
|
57 |
<br /> |
|
58 |
Stellen Sie aber sicher, dass Ihr Browser den Zeichensatz <strong>UTF-8</strong> unterstützt.<br /> |
|
59 |
<br /> |
|
60 |
Wenn Sie Interesse an der zugrundeliegenden PHP-Klasse haben, können Sie diese |
|
61 |
<a href="http://phlymail.com/de/downloads/idna/download/">hier herunterladen</a>.<br /> |
|
62 |
<br /> |
|
63 |
Diese Klasse wird ohne Garantie ihrer Funktionstüchtigkeit bereit gestellt. Nutzung auf eigene Gefahr.<br /> |
|
64 |
Um sicher zu stellen, dass eine Zeichenkette korrekt umgewandelt wurde, sollten Sie diese immer zurückwandeln |
|
65 |
und das Ergebnis mit Ihrer ursprünglichen Eingabe vergleichen.<br /> |
|
66 |
<br /> |
|
67 |
Fehler und Probleme können Sie gern an <a href="mailto:team@phlymail.de">team@phlymail.de</a> senden.<br /> |
|
68 |
<?php } else { ?> |
|
69 |
This converter translates domain names between the encoded (Punycode) notation |
|
70 |
and the decoded (UTF-8) notation.<br /> |
|
71 |
Just enter the domain name in the respective field and click on the button right below it to have |
|
72 |
it converted. Note that you can also enter complete domain names (like jürgen-müller.de) |
|
73 |
or email addresses.<br /> |
|
74 |
<br /> |
|
75 |
Make sure that your browser supports the <strong>UTF-8</strong> character encoding.<br /> |
|
76 |
<br /> |
|
77 |
If you are interested in the PHP source of the underlying class, you can |
|
78 |
<a href="http://phlymail.com/en/downloads/idna/download/">download it here</a>.<br /> |
|
79 |
<br /> |
|
80 |
Please be aware that this class is provided as is and without any liability. Use at your own risk.<br /> |
|
81 |
To ensure that a string has been converted correctly, you should convert it both ways and compare the |
|
82 |
results.<br /> |
|
83 |
<br /> |
|
84 |
Please feel free to report bugs and problems to: <a href="mailto:team@phlymail.com">team@phlymail.com</a>.<br /> |
|
85 |
<?php } ?> |
|
86 |
<br /> |
|
87 |
</div> |
|
88 |
<table border="0" cellpadding="2" cellspacing="2" align="center"> |
|
89 |
<thead> |
|
90 |
<tr> |
|
91 |
<th align="left">Original (Unicode)</th> |
|
92 |
<th align="right">Punycode (ACE)</th> |
|
93 |
</tr> |
|
94 |
</thead> |
|
95 |
<tbody> |
|
96 |
<tr> |
|
97 |
<td align="right"> |
|
98 |
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get"> |
|
99 |
<input type="text" name="decoded" value="<?php echo htmlentities($decoded, ENT_QUOTES, 'UTF-8'); ?>" size="48" maxlength="255" /><br /> |
|
100 |
<input type="submit" name="encode" value="Encode >>" /><?php echo $add; ?> |
|
101 |
</form> |
|
102 |
</td> |
|
103 |
<td align="left"> |
|
104 |
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get"> |
|
105 |
<input type="text" name="encoded" value="<?php echo htmlentities($encoded, ENT_QUOTES, 'UTF-8'); ?>" size="48" maxlength="255" /><br /> |
|
106 |
<input type="submit" name="decode" value="<< Decode" /><?php echo $add; ?> |
|
107 |
</form> |
|
108 |
</td> |
|
109 |
</tr> |
|
110 |
</tbody> |
|
111 |
</table> |
|
112 |
<br /> |
|
113 |
<span id="copy">Version used: 0.6.9; © 2004-2010 phlyLabs Berlin; part of <a href="http://phlymail.com/">phlyMail</a></span> |
|
114 |
</div> |
|
115 |
</body> |
|
116 |
</html> |
|
0 | 117 |
branches/2.8.x/wb/include/idna_convert/LICENCE | ||
---|---|---|
1 |
GNU LESSER GENERAL PUBLIC LICENSE |
|
2 |
Version 2.1, February 1999 |
|
3 |
|
|
4 |
Copyright (C) 1991, 1999 Free Software Foundation, Inc. |
|
5 |
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
|
6 |
Everyone is permitted to copy and distribute verbatim copies |
|
7 |
of this license document, but changing it is not allowed. |
|
8 |
|
|
9 |
[This is the first released version of the Lesser GPL. It also counts |
|
10 |
as the successor of the GNU Library Public License, version 2, hence |
|
11 |
the version number 2.1.] |
|
12 |
|
|
13 |
Preamble |
|
14 |
|
|
15 |
The licenses for most software are designed to take away your |
|
16 |
freedom to share and change it. By contrast, the GNU General Public |
|
17 |
Licenses are intended to guarantee your freedom to share and change |
|
18 |
free software--to make sure the software is free for all its users. |
|
19 |
|
|
20 |
This license, the Lesser General Public License, applies to some |
|
21 |
specially designated software packages--typically libraries--of the |
|
22 |
Free Software Foundation and other authors who decide to use it. You |
|
23 |
can use it too, but we suggest you first think carefully about whether |
|
24 |
this license or the ordinary General Public License is the better |
|
25 |
strategy to use in any particular case, based on the explanations below. |
|
26 |
|
|
27 |
When we speak of free software, we are referring to freedom of use, |
|
28 |
not price. Our General Public Licenses are designed to make sure that |
|
29 |
you have the freedom to distribute copies of free software (and charge |
|
30 |
for this service if you wish); that you receive source code or can get |
|
31 |
it if you want it; that you can change the software and use pieces of |
|
32 |
it in new free programs; and that you are informed that you can do |
|
33 |
these things. |
|
34 |
|
|
35 |
To protect your rights, we need to make restrictions that forbid |
|
36 |
distributors to deny you these rights or to ask you to surrender these |
|
37 |
rights. These restrictions translate to certain responsibilities for |
|
38 |
you if you distribute copies of the library or if you modify it. |
|
39 |
|
|
40 |
For example, if you distribute copies of the library, whether gratis |
|
41 |
or for a fee, you must give the recipients all the rights that we gave |
|
42 |
you. You must make sure that they, too, receive or can get the source |
|
43 |
code. If you link other code with the library, you must provide |
|
44 |
complete object files to the recipients, so that they can relink them |
|
45 |
with the library after making changes to the library and recompiling |
|
46 |
it. And you must show them these terms so they know their rights. |
|
47 |
|
|
48 |
We protect your rights with a two-step method: (1) we copyright the |
|
49 |
library, and (2) we offer you this license, which gives you legal |
|
50 |
permission to copy, distribute and/or modify the library. |
|
51 |
|
|
52 |
To protect each distributor, we want to make it very clear that |
|
53 |
there is no warranty for the free library. Also, if the library is |
|
54 |
modified by someone else and passed on, the recipients should know |
|
55 |
that what they have is not the original version, so that the original |
|
56 |
author's reputation will not be affected by problems that might be |
|
57 |
introduced by others. |
|
58 |
|
|
59 |
Finally, software patents pose a constant threat to the existence of |
|
60 |
any free program. We wish to make sure that a company cannot |
|
61 |
effectively restrict the users of a free program by obtaining a |
|
62 |
restrictive license from a patent holder. Therefore, we insist that |
|
63 |
any patent license obtained for a version of the library must be |
|
64 |
consistent with the full freedom of use specified in this license. |
|
65 |
|
|
66 |
Most GNU software, including some libraries, is covered by the |
|
67 |
ordinary GNU General Public License. This license, the GNU Lesser |
|
68 |
General Public License, applies to certain designated libraries, and |
|
69 |
is quite different from the ordinary General Public License. We use |
|
70 |
this license for certain libraries in order to permit linking those |
|
71 |
libraries into non-free programs. |
|
72 |
|
|
73 |
When a program is linked with a library, whether statically or using |
|
74 |
a shared library, the combination of the two is legally speaking a |
|
75 |
combined work, a derivative of the original library. The ordinary |
|
76 |
General Public License therefore permits such linking only if the |
|
77 |
entire combination fits its criteria of freedom. The Lesser General |
|
78 |
Public License permits more lax criteria for linking other code with |
|
79 |
the library. |
|
80 |
|
|
81 |
We call this license the "Lesser" General Public License because it |
|
82 |
does Less to protect the user's freedom than the ordinary General |
|
83 |
Public License. It also provides other free software developers Less |
|
84 |
of an advantage over competing non-free programs. These disadvantages |
|
85 |
are the reason we use the ordinary General Public License for many |
|
86 |
libraries. However, the Lesser license provides advantages in certain |
|
87 |
special circumstances. |
|
88 |
|
|
89 |
For example, on rare occasions, there may be a special need to |
|
90 |
encourage the widest possible use of a certain library, so that it becomes |
|
91 |
a de-facto standard. To achieve this, non-free programs must be |
|
92 |
allowed to use the library. A more frequent case is that a free |
|
93 |
library does the same job as widely used non-free libraries. In this |
|
94 |
case, there is little to gain by limiting the free library to free |
|
95 |
software only, so we use the Lesser General Public License. |
|
96 |
|
|
97 |
In other cases, permission to use a particular library in non-free |
|
98 |
programs enables a greater number of people to use a large body of |
|
99 |
free software. For example, permission to use the GNU C Library in |
|
100 |
non-free programs enables many more people to use the whole GNU |
|
101 |
operating system, as well as its variant, the GNU/Linux operating |
|
102 |
system. |
|
103 |
|
|
104 |
Although the Lesser General Public License is Less protective of the |
|
105 |
users' freedom, it does ensure that the user of a program that is |
|
106 |
linked with the Library has the freedom and the wherewithal to run |
|
107 |
that program using a modified version of the Library. |
|
108 |
|
|
109 |
The precise terms and conditions for copying, distribution and |
|
110 |
modification follow. Pay close attention to the difference between a |
|
111 |
"work based on the library" and a "work that uses the library". The |
|
112 |
former contains code derived from the library, whereas the latter must |
|
113 |
be combined with the library in order to run. |
|
114 |
|
|
115 |
GNU LESSER GENERAL PUBLIC LICENSE |
|
116 |
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION |
|
117 |
|
|
118 |
0. This License Agreement applies to any software library or other |
|
119 |
program which contains a notice placed by the copyright holder or |
|
120 |
other authorized party saying it may be distributed under the terms of |
|
121 |
this Lesser General Public License (also called "this License"). |
|
122 |
Each licensee is addressed as "you". |
|
123 |
|
|
124 |
A "library" means a collection of software functions and/or data |
|
125 |
prepared so as to be conveniently linked with application programs |
|
126 |
(which use some of those functions and data) to form executables. |
|
127 |
|
|
128 |
The "Library", below, refers to any such software library or work |
|
129 |
which has been distributed under these terms. A "work based on the |
|
130 |
Library" means either the Library or any derivative work under |
|
131 |
copyright law: that is to say, a work containing the Library or a |
|
132 |
portion of it, either verbatim or with modifications and/or translated |
|
133 |
straightforwardly into another language. (Hereinafter, translation is |
|
134 |
included without limitation in the term "modification".) |
|
135 |
|
|
136 |
"Source code" for a work means the preferred form of the work for |
|
137 |
making modifications to it. For a library, complete source code means |
|
138 |
all the source code for all modules it contains, plus any associated |
|
139 |
interface definition files, plus the scripts used to control compilation |
|
140 |
and installation of the library. |
|
141 |
|
|
142 |
Activities other than copying, distribution and modification are not |
|
143 |
covered by this License; they are outside its scope. The act of |
|
144 |
running a program using the Library is not restricted, and output from |
|
145 |
such a program is covered only if its contents constitute a work based |
|
146 |
on the Library (independent of the use of the Library in a tool for |
|
147 |
writing it). Whether that is true depends on what the Library does |
|
148 |
and what the program that uses the Library does. |
|
149 |
|
|
150 |
1. You may copy and distribute verbatim copies of the Library's |
|
151 |
complete source code as you receive it, in any medium, provided that |
|
152 |
you conspicuously and appropriately publish on each copy an |
|
153 |
appropriate copyright notice and disclaimer of warranty; keep intact |
|
154 |
all the notices that refer to this License and to the absence of any |
|
155 |
warranty; and distribute a copy of this License along with the |
|
156 |
Library. |
|
157 |
|
|
158 |
You may charge a fee for the physical act of transferring a copy, |
|
159 |
and you may at your option offer warranty protection in exchange for a |
|
160 |
fee. |
|
161 |
|
|
162 |
2. You may modify your copy or copies of the Library or any portion |
|
163 |
of it, thus forming a work based on the Library, and copy and |
|
164 |
distribute such modifications or work under the terms of Section 1 |
|
165 |
above, provided that you also meet all of these conditions: |
|
166 |
|
|
167 |
a) The modified work must itself be a software library. |
|
168 |
|
|
169 |
b) You must cause the files modified to carry prominent notices |
|
170 |
stating that you changed the files and the date of any change. |
|
171 |
|
|
172 |
c) You must cause the whole of the work to be licensed at no |
|
173 |
charge to all third parties under the terms of this License. |
|
174 |
|
|
175 |
d) If a facility in the modified Library refers to a function or a |
|
176 |
table of data to be supplied by an application program that uses |
|
177 |
the facility, other than as an argument passed when the facility |
|
178 |
is invoked, then you must make a good faith effort to ensure that, |
|
179 |
in the event an application does not supply such function or |
|
180 |
table, the facility still operates, and performs whatever part of |
|
181 |
its purpose remains meaningful. |
|
182 |
|
|
183 |
(For example, a function in a library to compute square roots has |
|
184 |
a purpose that is entirely well-defined independent of the |
|
185 |
application. Therefore, Subsection 2d requires that any |
|
186 |
application-supplied function or table used by this function must |
|
187 |
be optional: if the application does not supply it, the square |
|
188 |
root function must still compute square roots.) |
|
189 |
|
|
190 |
These requirements apply to the modified work as a whole. If |
|
191 |
identifiable sections of that work are not derived from the Library, |
|
192 |
and can be reasonably considered independent and separate works in |
|
193 |
themselves, then this License, and its terms, do not apply to those |
|
194 |
sections when you distribute them as separate works. But when you |
|
195 |
distribute the same sections as part of a whole which is a work based |
|
196 |
on the Library, the distribution of the whole must be on the terms of |
|
197 |
this License, whose permissions for other licensees extend to the |
|
198 |
entire whole, and thus to each and every part regardless of who wrote |
|
199 |
it. |
|
200 |
|
|
201 |
Thus, it is not the intent of this section to claim rights or contest |
|
202 |
your rights to work written entirely by you; rather, the intent is to |
|
203 |
exercise the right to control the distribution of derivative or |
|
204 |
collective works based on the Library. |
|
205 |
|
|
206 |
In addition, mere aggregation of another work not based on the Library |
|
207 |
with the Library (or with a work based on the Library) on a volume of |
|
208 |
a storage or distribution medium does not bring the other work under |
|
209 |
the scope of this License. |
|
210 |
|
|
211 |
3. You may opt to apply the terms of the ordinary GNU General Public |
|
212 |
License instead of this License to a given copy of the Library. To do |
|
213 |
this, you must alter all the notices that refer to this License, so |
|
214 |
that they refer to the ordinary GNU General Public License, version 2, |
|
215 |
instead of to this License. (If a newer version than version 2 of the |
|
216 |
ordinary GNU General Public License has appeared, then you can specify |
|
217 |
that version instead if you wish.) Do not make any other change in |
|
218 |
these notices. |
|
219 |
|
|
220 |
Once this change is made in a given copy, it is irreversible for |
|
221 |
that copy, so the ordinary GNU General Public License applies to all |
|
222 |
subsequent copies and derivative works made from that copy. |
|
223 |
|
|
224 |
This option is useful when you wish to copy part of the code of |
|
225 |
the Library into a program that is not a library. |
|
226 |
|
|
227 |
4. You may copy and distribute the Library (or a portion or |
|
228 |
derivative of it, under Section 2) in object code or executable form |
|
229 |
under the terms of Sections 1 and 2 above provided that you accompany |
|
230 |
it with the complete corresponding machine-readable source code, which |
|
231 |
must be distributed under the terms of Sections 1 and 2 above on a |
|
232 |
medium customarily used for software interchange. |
|
233 |
|
|
234 |
If distribution of object code is made by offering access to copy |
|
235 |
from a designated place, then offering equivalent access to copy the |
|
236 |
source code from the same place satisfies the requirement to |
|
237 |
distribute the source code, even though third parties are not |
|
238 |
compelled to copy the source along with the object code. |
|
239 |
|
|
240 |
5. A program that contains no derivative of any portion of the |
|
241 |
Library, but is designed to work with the Library by being compiled or |
|
242 |
linked with it, is called a "work that uses the Library". Such a |
|
243 |
work, in isolation, is not a derivative work of the Library, and |
|
244 |
therefore falls outside the scope of this License. |
|
245 |
|
|
246 |
However, linking a "work that uses the Library" with the Library |
|
247 |
creates an executable that is a derivative of the Library (because it |
|
248 |
contains portions of the Library), rather than a "work that uses the |
|
249 |
library". The executable is therefore covered by this License. |
|
250 |
Section 6 states terms for distribution of such executables. |
|
251 |
|
|
252 |
When a "work that uses the Library" uses material from a header file |
|
253 |
that is part of the Library, the object code for the work may be a |
|
254 |
derivative work of the Library even though the source code is not. |
|
255 |
Whether this is true is especially significant if the work can be |
|
256 |
linked without the Library, or if the work is itself a library. The |
|
257 |
threshold for this to be true is not precisely defined by law. |
|
258 |
|
|
259 |
If such an object file uses only numerical parameters, data |
|
260 |
structure layouts and accessors, and small macros and small inline |
|
261 |
functions (ten lines or less in length), then the use of the object |
|
262 |
file is unrestricted, regardless of whether it is legally a derivative |
|
263 |
work. (Executables containing this object code plus portions of the |
|
264 |
Library will still fall under Section 6.) |
|
265 |
|
|
266 |
Otherwise, if the work is a derivative of the Library, you may |
|
267 |
distribute the object code for the work under the terms of Section 6. |
|
268 |
Any executables containing that work also fall under Section 6, |
|
269 |
whether or not they are linked directly with the Library itself. |
|
270 |
|
|
271 |
6. As an exception to the Sections above, you may also combine or |
|
272 |
link a "work that uses the Library" with the Library to produce a |
|
273 |
work containing portions of the Library, and distribute that work |
|
274 |
under terms of your choice, provided that the terms permit |
|
275 |
modification of the work for the customer's own use and reverse |
|
276 |
engineering for debugging such modifications. |
|
277 |
|
|
278 |
You must give prominent notice with each copy of the work that the |
|
279 |
Library is used in it and that the Library and its use are covered by |
|
280 |
this License. You must supply a copy of this License. If the work |
|
281 |
during execution displays copyright notices, you must include the |
|
282 |
copyright notice for the Library among them, as well as a reference |
|
283 |
directing the user to the copy of this License. Also, you must do one |
|
284 |
of these things: |
|
285 |
|
|
286 |
a) Accompany the work with the complete corresponding |
|
287 |
machine-readable source code for the Library including whatever |
|
288 |
changes were used in the work (which must be distributed under |
|
289 |
Sections 1 and 2 above); and, if the work is an executable linked |
|
290 |
with the Library, with the complete machine-readable "work that |
|
291 |
uses the Library", as object code and/or source code, so that the |
|
292 |
user can modify the Library and then relink to produce a modified |
|
293 |
executable containing the modified Library. (It is understood |
|
294 |
that the user who changes the contents of definitions files in the |
|
295 |
Library will not necessarily be able to recompile the application |
|
296 |
to use the modified definitions.) |
|
297 |
|
|
298 |
b) Use a suitable shared library mechanism for linking with the |
|
299 |
Library. A suitable mechanism is one that (1) uses at run time a |
|
300 |
copy of the library already present on the user's computer system, |
|
301 |
rather than copying library functions into the executable, and (2) |
|
302 |
will operate properly with a modified version of the library, if |
|
303 |
the user installs one, as long as the modified version is |
|
304 |
interface-compatible with the version that the work was made with. |
|
305 |
|
|
306 |
c) Accompany the work with a written offer, valid for at |
|
307 |
least three years, to give the same user the materials |
|
308 |
specified in Subsection 6a, above, for a charge no more |
|
309 |
than the cost of performing this distribution. |
|
310 |
|
|
311 |
d) If distribution of the work is made by offering access to copy |
|
312 |
from a designated place, offer equivalent access to copy the above |
|
313 |
specified materials from the same place. |
|
314 |
|
|
315 |
e) Verify that the user has already received a copy of these |
|
316 |
materials or that you have already sent this user a copy. |
|
317 |
|
|
318 |
For an executable, the required form of the "work that uses the |
|
319 |
Library" must include any data and utility programs needed for |
|
320 |
reproducing the executable from it. However, as a special exception, |
|
321 |
the materials to be distributed need not include anything that is |
|
322 |
normally distributed (in either source or binary form) with the major |
|
323 |
components (compiler, kernel, and so on) of the operating system on |
|
324 |
which the executable runs, unless that component itself accompanies |
|
325 |
the executable. |
|
326 |
|
|
327 |
It may happen that this requirement contradicts the license |
|
328 |
restrictions of other proprietary libraries that do not normally |
|
329 |
accompany the operating system. Such a contradiction means you cannot |
|
330 |
use both them and the Library together in an executable that you |
|
331 |
distribute. |
|
332 |
|
|
333 |
7. You may place library facilities that are a work based on the |
|
334 |
Library side-by-side in a single library together with other library |
|
335 |
facilities not covered by this License, and distribute such a combined |
|
336 |
library, provided that the separate distribution of the work based on |
|
337 |
the Library and of the other library facilities is otherwise |
|
338 |
permitted, and provided that you do these two things: |
|
339 |
|
|
340 |
a) Accompany the combined library with a copy of the same work |
|
341 |
based on the Library, uncombined with any other library |
|
342 |
facilities. This must be distributed under the terms of the |
|
343 |
Sections above. |
|
344 |
|
|
345 |
b) Give prominent notice with the combined library of the fact |
|
346 |
that part of it is a work based on the Library, and explaining |
|
347 |
where to find the accompanying uncombined form of the same work. |
|
348 |
|
|
349 |
8. You may not copy, modify, sublicense, link with, or distribute |
|
350 |
the Library except as expressly provided under this License. Any |
|
351 |
attempt otherwise to copy, modify, sublicense, link with, or |
|
352 |
distribute the Library is void, and will automatically terminate your |
|
353 |
rights under this License. However, parties who have received copies, |
|
354 |
or rights, from you under this License will not have their licenses |
|
355 |
terminated so long as such parties remain in full compliance. |
|
356 |
|
|
357 |
9. You are not required to accept this License, since you have not |
|
358 |
signed it. However, nothing else grants you permission to modify or |
|
359 |
distribute the Library or its derivative works. These actions are |
|
360 |
prohibited by law if you do not accept this License. Therefore, by |
|
361 |
modifying or distributing the Library (or any work based on the |
|
362 |
Library), you indicate your acceptance of this License to do so, and |
|
363 |
all its terms and conditions for copying, distributing or modifying |
|
364 |
the Library or works based on it. |
|
365 |
|
|
366 |
10. Each time you redistribute the Library (or any work based on the |
|
367 |
Library), the recipient automatically receives a license from the |
|
368 |
original licensor to copy, distribute, link with or modify the Library |
|
369 |
subject to these terms and conditions. You may not impose any further |
|
370 |
restrictions on the recipients' exercise of the rights granted herein. |
|
371 |
You are not responsible for enforcing compliance by third parties with |
|
372 |
this License. |
|
373 |
|
|
374 |
11. If, as a consequence of a court judgment or allegation of patent |
|
375 |
infringement or for any other reason (not limited to patent issues), |
|
376 |
conditions are imposed on you (whether by court order, agreement or |
|
377 |
otherwise) that contradict the conditions of this License, they do not |
|
378 |
excuse you from the conditions of this License. If you cannot |
|
379 |
distribute so as to satisfy simultaneously your obligations under this |
|
380 |
License and any other pertinent obligations, then as a consequence you |
|
381 |
may not distribute the Library at all. For example, if a patent |
|
382 |
license would not permit royalty-free redistribution of the Library by |
|
383 |
all those who receive copies directly or indirectly through you, then |
|
384 |
the only way you could satisfy both it and this License would be to |
|
385 |
refrain entirely from distribution of the Library. |
|
386 |
|
|
387 |
If any portion of this section is held invalid or unenforceable under any |
|
388 |
particular circumstance, the balance of the section is intended to apply, |
|
389 |
and the section as a whole is intended to apply in other circumstances. |
|
390 |
|
|
391 |
It is not the purpose of this section to induce you to infringe any |
|
392 |
patents or other property right claims or to contest validity of any |
|
393 |
such claims; this section has the sole purpose of protecting the |
|
394 |
integrity of the free software distribution system which is |
|
395 |
implemented by public license practices. Many people have made |
|
396 |
generous contributions to the wide range of software distributed |
|
397 |
through that system in reliance on consistent application of that |
|
398 |
system; it is up to the author/donor to decide if he or she is willing |
|
399 |
to distribute software through any other system and a licensee cannot |
|
400 |
impose that choice. |
|
401 |
|
|
402 |
This section is intended to make thoroughly clear what is believed to |
|
403 |
be a consequence of the rest of this License. |
|
404 |
|
|
405 |
12. If the distribution and/or use of the Library is restricted in |
|
406 |
certain countries either by patents or by copyrighted interfaces, the |
|
407 |
original copyright holder who places the Library under this License may add |
|
408 |
an explicit geographical distribution limitation excluding those countries, |
|
409 |
so that distribution is permitted only in or among countries not thus |
|
410 |
excluded. In such case, this License incorporates the limitation as if |
|
411 |
written in the body of this License. |
|
412 |
|
|
413 |
13. The Free Software Foundation may publish revised and/or new |
|
414 |
versions of the Lesser General Public License from time to time. |
|
415 |
Such new versions will be similar in spirit to the present version, |
|
416 |
but may differ in detail to address new problems or concerns. |
|
417 |
|
|
418 |
Each version is given a distinguishing version number. If the Library |
|
419 |
specifies a version number of this License which applies to it and |
|
420 |
"any later version", you have the option of following the terms and |
|
421 |
conditions either of that version or of any later version published by |
|
422 |
the Free Software Foundation. If the Library does not specify a |
|
423 |
license version number, you may choose any version ever published by |
|
424 |
the Free Software Foundation. |
|
425 |
|
|
426 |
14. If you wish to incorporate parts of the Library into other free |
|
427 |
programs whose distribution conditions are incompatible with these, |
|
428 |
write to the author to ask for permission. For software which is |
|
429 |
copyrighted by the Free Software Foundation, write to the Free |
|
430 |
Software Foundation; we sometimes make exceptions for this. Our |
|
431 |
decision will be guided by the two goals of preserving the free status |
|
432 |
of all derivatives of our free software and of promoting the sharing |
|
433 |
and reuse of software generally. |
|
434 |
|
|
435 |
NO WARRANTY |
|
436 |
|
|
437 |
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO |
|
438 |
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. |
|
439 |
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR |
|
440 |
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY |
|
441 |
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE |
|
442 |
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR |
|
443 |
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE |
|
444 |
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME |
|
445 |
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. |
|
446 |
|
|
447 |
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN |
|
448 |
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY |
|
449 |
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU |
|
450 |
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR |
|
451 |
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE |
|
452 |
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING |
|
453 |
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A |
|
454 |
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF |
|
455 |
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH |
|
456 |
DAMAGES. |
|
457 |
|
|
458 |
END OF TERMS AND CONDITIONS |
|
459 |
|
|
460 |
How to Apply These Terms to Your New Libraries |
|
461 |
|
|
462 |
If you develop a new library, and you want it to be of the greatest |
|
463 |
possible use to the public, we recommend making it free software that |
|
464 |
everyone can redistribute and change. You can do so by permitting |
|
465 |
redistribution under these terms (or, alternatively, under the terms of the |
|
466 |
ordinary General Public License). |
|
467 |
|
|
468 |
To apply these terms, attach the following notices to the library. It is |
|
469 |
safest to attach them to the start of each source file to most effectively |
|
470 |
convey the exclusion of warranty; and each file should have at least the |
|
471 |
"copyright" line and a pointer to where the full notice is found. |
|
472 |
|
|
473 |
<one line to give the library's name and a brief idea of what it does.> |
|
474 |
Copyright (C) <year> <name of author> |
|
475 |
|
|
476 |
This library is free software; you can redistribute it and/or |
|
477 |
modify it under the terms of the GNU Lesser General Public |
|
478 |
License as published by the Free Software Foundation; either |
|
479 |
version 2.1 of the License, or (at your option) any later version. |
|
480 |
|
|
481 |
This library is distributed in the hope that it will be useful, |
|
482 |
but WITHOUT ANY WARRANTY; without even the implied warranty of |
|
483 |
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
|
484 |
Lesser General Public License for more details. |
|
485 |
|
|
486 |
You should have received a copy of the GNU Lesser General Public |
|
487 |
License along with this library; if not, write to the Free Software |
|
488 |
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
|
489 |
|
|
490 |
Also add information on how to contact you by electronic and paper mail. |
|
491 |
|
|
492 |
You should also get your employer (if you work as a programmer) or your |
|
493 |
school, if any, to sign a "copyright disclaimer" for the library, if |
|
494 |
necessary. Here is a sample; alter the names: |
|
495 |
|
|
496 |
Yoyodyne, Inc., hereby disclaims all copyright interest in the |
|
497 |
library `Frob' (a library for tweaking knobs) written by James Random Hacker. |
|
498 |
|
|
499 |
<signature of Ty Coon>, 1 April 1990 |
|
500 |
Ty Coon, President of Vice |
|
501 |
|
|
502 |
That's all there is to it! |
branches/2.8.x/wb/include/idna_convert/idna_convert.class.php | ||
---|---|---|
1 |
<?php |
|
2 |
// {{{ license |
|
3 |
|
|
4 |
/* vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4 foldmethod=marker: */ |
|
5 |
// |
|
6 |
// +----------------------------------------------------------------------+ |
|
7 |
// | This library is free software; you can redistribute it and/or modify | |
|
8 |
// | it under the terms of the GNU Lesser General Public License as | |
|
9 |
// | published by the Free Software Foundation; either version 2.1 of the | |
|
10 |
// | License, or (at your option) any later version. | |
|
11 |
// | | |
|
12 |
// | This library is distributed in the hope that it will be useful, but | |
|
13 |
// | WITHOUT ANY WARRANTY; without even the implied warranty of | |
|
14 |
// | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU | |
|
15 |
// | Lesser General Public License for more details. | |
|
16 |
// | | |
|
17 |
// | You should have received a copy of the GNU Lesser General Public | |
|
18 |
// | License along with this library; if not, write to the Free Software | |
|
19 |
// | Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 | |
|
20 |
// | USA. | |
|
21 |
// +----------------------------------------------------------------------+ |
|
22 |
// |
|
23 |
|
|
24 |
// }}} |
|
25 |
|
|
26 |
/** |
|
27 |
* Encode/decode Internationalized Domain Names. |
|
28 |
* |
|
29 |
* This class converts internationalized domain names |

30 |
* (see RFC 3490 for details), as used by various registries worldwide, |

31 |
* between their original (localized) form and their encoded form |

32 |
* as used in the DNS (Domain Name System). |
|
33 |
* |
|
34 |
* The class provides two public methods, encode() and decode(), which do exactly |
|
35 |
* what you would expect them to do. You are allowed to use complete domain names, |
|
36 |
* simple strings and complete email addresses as well. That means you may |
|
37 |
* use any of the following notations: |
|
38 |
* |
|
39 |
* - www.nörgler.com |
|
40 |
* - xn--nrgler-wxa |
|
41 |
* - xn--brse-5qa.xn--knrz-1ra.info |
|
42 |
* |
|
43 |
* Unicode input may be given as a UTF-8 string, a UCS-4 string, or a UCS-4 array. |
|
44 |
* Unicode output is available in the same formats. |
|
45 |
* You can select your preferred format via {@link set_parameter()}. |
|
46 |
* |
|
47 |
* ACE input and output is always expected to be ASCII. |
|
48 |
* |
|
49 |
* @author Matthias Sommerfeld <mso@phlylabs.de> |
|
50 |
* @author Leonid Kogan <lko@neuse.de> |
|
51 |
* @copyright 2004-2010 phlyLabs Berlin, http://phlylabs.de |
|
52 |
* @version 0.6.9 2010-11-04 |
|
53 |
*/ |
|
54 |
class idna_convert |
|
55 |
{ |
|
56 |
// NP See below |
|
57 |
|
|
58 |
// Internal settings, do not mess with them |
|
59 |
protected $_punycode_prefix = 'xn--'; |
|
60 |
protected $_invalid_ucs = 0x80000000; |
|
61 |
protected $_max_ucs = 0x10FFFF; |
|
62 |
protected $_base = 36; |
|
63 |
protected $_tmin = 1; |
|
64 |
protected $_tmax = 26; |
|
65 |
protected $_skew = 38; |
|
66 |
protected $_damp = 700; |
|
67 |
protected $_initial_bias = 72; |
|
68 |
protected $_initial_n = 0x80; |
|
69 |
protected $_sbase = 0xAC00; |
|
70 |
protected $_lbase = 0x1100; |
|
71 |
protected $_vbase = 0x1161; |
|
72 |
protected $_tbase = 0x11A7; |
|
73 |
protected $_lcount = 19; |
|
74 |
protected $_vcount = 21; |
|
75 |
protected $_tcount = 28; |
|
76 |
protected $_ncount = 588; // _vcount * _tcount |
|
77 |
protected $_scount = 11172; // _lcount * _tcount * _vcount |
|
78 |
protected $_error = false; |
|
79 |
|
|
80 |
// See {@link set_parameter()} for details of how to change the following |
|
81 |
// settings from within your script / application |
|
82 |
protected $_api_encoding = 'utf8'; // Default input charset is UTF-8 |
|
83 |
protected $_allow_overlong = false; // Overlong UTF-8 encodings are forbidden |
|
84 |
protected $_strict_mode = false; // Behave strict or not |
|
85 |
protected $_encode_german_sz = true; // True to encode German ß; False, if not |
|
86 |
|
|
87 |
/** |
|
88 |
* the constructor |
|
89 |
* |
|
90 |
* @param array $options |
|
91 |
* @return boolean |
|
92 |
* @since 0.5.2 |
|
93 |
*/ |
|
94 |
public function __construct($options = false) |
|
95 |
{ |
|
96 |
$this->slast = $this->_sbase + $this->_lcount * $this->_vcount * $this->_tcount; |
|
97 |
// If parameters are given, pass these to the respective method |
|
98 |
if (is_array($options)) return $this->set_parameter($options); |
|
99 |
if (!$this->_encode_german_sz) { |
|
100 |
$this->NP['replacemaps'][0xDF] = array(0x73, 0x73); |
|
101 |
} |
|
102 |
} |
|
103 |
|
|
104 |
/** |
|
105 |
* Sets a new option value. Available options and values: |
|
106 |
* [encoding - Use either UTF-8, UCS4 as array or UCS4 as string as input ('utf8' for UTF-8, |
|
107 |
* 'ucs4_string' and 'ucs4_array' respectively for UCS4); The output is always UTF-8] |
|
108 |
* [overlong - Unicode does not allow unnecessarily long encodings of chars, |
|
109 |
* to allow this, set this parameter to true, else to false; |
|
110 |
* default is false.] |
|
111 |
* [strict - true: strict mode, good for registration purposes - Causes errors |
|
112 |
* on failures; false: loose mode, ideal for "wildlife" applications |
|
113 |
* by silently ignoring errors and returning the original input instead] |
|
114 |
* |
|
115 |
* @param mixed Parameter to set (string: single parameter; array of Parameter => Value pairs) |
|
116 |
* @param string Value to use (if parameter 1 is a string) |
|
117 |
* @return boolean true on success, false otherwise |
|
118 |
*/ |
|
119 |
public function set_parameter($option, $value = false) |
|
120 |
{ |
|
121 |
if (!is_array($option)) { |
|
122 |
$option = array($option => $value); |
|
123 |
} |
|
124 |
foreach ($option as $k => $v) { |
|
125 |
switch ($k) { |
|
126 |
case 'encoding': |
|
127 |
switch ($v) { |
|
128 |
case 'utf8': |
|
129 |
case 'ucs4_string': |
|
130 |
case 'ucs4_array': |
|
131 |
$this->_api_encoding = $v; |
|
132 |
break; |
|
133 |
default: |
|
134 |
$this->_error('Set Parameter: Unknown parameter '.$v.' for option '.$k); |
|
135 |
return false; |
|
136 |
} |
|
137 |
break; |
|
138 |
case 'overlong': |
|
139 |
$this->_allow_overlong = ($v) ? true : false; |
|
140 |
break; |
|
141 |
case 'strict': |
|
142 |
$this->_strict_mode = ($v) ? true : false; |
|
143 |
break; |
|
144 |
case 'encode_german_sz': |
|
145 |
$this->_encode_german_sz = ($v) ? true : false; |
|
146 |
break; |
|
147 |
default: |
|
148 |
$this->_error('Set Parameter: Unknown option '.$k); |
|
149 |
return false; |
|
150 |
} |
|
151 |
} |
|
152 |
return true; |
|
153 |
} |
|
154 |
|
|
155 |
/** |
|
156 |
* Decode a given ACE domain name |
|
157 |
* @param string Domain name (ACE string) |
|
158 |
* [@param string Desired output encoding, see {@link set_parameter}] |
|
159 |
* @return string Decoded Domain name (UTF-8 or UCS-4) |
|
160 |
*/ |
|
161 |
public function decode($input, $one_time_encoding = false) |
|
162 |
{ |
|
163 |
// Optionally set |
|
164 |
if ($one_time_encoding) { |
|
165 |
switch ($one_time_encoding) { |
|
166 |
case 'utf8': |
|
167 |
case 'ucs4_string': |
|
168 |
case 'ucs4_array': |
|
169 |
break; |
|
170 |
default: |
|
171 |
$this->_error('Unknown encoding '.$one_time_encoding); |
|
172 |
return false; |
|
173 |
} |
|
174 |
} |
|
175 |
// Make sure to drop any newline characters around |
|
176 |
$input = trim($input); |
|
177 |
|
|
178 |
// Inspect the input and try to determine whether it is a plain string, |
|
179 |
// an email address or something like a complete URL |
|
180 |
if (strpos($input, '@')) { // Maybe it is an email address |
|
181 |
// Not allowed in strict mode |
|
182 |
if ($this->_strict_mode) { |
|
183 |
$this->_error('Only simple domain name parts can be handled in strict mode'); |
|
184 |
return false; |
|
185 |
} |
|
186 |
list ($email_pref, $input) = explode('@', $input, 2); |
|
187 |
$arr = explode('.', $input); |
|
188 |
foreach ($arr as $k => $v) { |
|
189 |
if (preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $v)) { |
|
190 |
$conv = $this->_decode($v); |
|
191 |
if ($conv) $arr[$k] = $conv; |
|
192 |
} |
|
193 |
} |
|
194 |
$input = join('.', $arr); |
|
195 |
$arr = explode('.', $email_pref); |
|
196 |
foreach ($arr as $k => $v) { |
|
197 |
if (preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $v)) { |
|
198 |
$conv = $this->_decode($v); |
|
199 |
if ($conv) $arr[$k] = $conv; |
|
200 |
} |
|
201 |
} |
|
202 |
$email_pref = join('.', $arr); |
|
203 |
$return = $email_pref . '@' . $input; |
|
204 |
} elseif (preg_match('![:\./]!', $input)) { // Or a complete domain name (with or without paths / parameters) |
|
205 |
// Not allowed in strict mode |
|
206 |
if ($this->_strict_mode) { |
|
207 |
$this->_error('Only simple domain name parts can be handled in strict mode'); |
|
208 |
return false; |
|
209 |
} |
|
210 |
$parsed = parse_url($input); |
|
211 |
if (isset($parsed['host'])) { |
|
212 |
$arr = explode('.', $parsed['host']); |
|
213 |
foreach ($arr as $k => $v) { |
|
214 |
$conv = $this->_decode($v); |
|
215 |
if ($conv) $arr[$k] = $conv; |
|
216 |
} |
|
217 |
$parsed['host'] = join('.', $arr); |
|
218 |
$return = |
|
219 |
(empty($parsed['scheme']) ? '' : $parsed['scheme'].(strtolower($parsed['scheme']) == 'mailto' ? ':' : '://')) |
|
220 |
.(empty($parsed['user']) ? '' : $parsed['user'].(empty($parsed['pass']) ? '' : ':'.$parsed['pass']).'@') |
|
221 |
.$parsed['host'] |
|
222 |
.(empty($parsed['port']) ? '' : ':'.$parsed['port']) |
|
223 |
.(empty($parsed['path']) ? '' : $parsed['path']) |
|
224 |
.(empty($parsed['query']) ? '' : '?'.$parsed['query']) |
|
225 |
.(empty($parsed['fragment']) ? '' : '#'.$parsed['fragment']); |
|
226 |
} else { // parse_url seems to have failed, try without it |
|
227 |
$arr = explode('.', $input); |
|
228 |
foreach ($arr as $k => $v) { |
|
229 |
$conv = $this->_decode($v); |
|
230 |
$arr[$k] = ($conv) ? $conv : $v; |
|
231 |
} |
|
232 |
$return = join('.', $arr); |
|
233 |
} |
|
234 |
} else { // Otherwise we consider it being a pure domain name string |
|
235 |
$return = $this->_decode($input); |
|
236 |
if (!$return) $return = $input; |
|
237 |
} |
|
238 |
// The output is UTF-8 by default, other output formats need conversion here |
|
239 |
// If a one-time encoding is given, use it; otherwise use the object's property |
|
240 |
switch (($one_time_encoding) ? $one_time_encoding : $this->_api_encoding) { |
|
241 |
case 'utf8': |
|
242 |
return $return; |
|
243 |
break; |
|
244 |
case 'ucs4_string': |
|
245 |
return $this->_ucs4_to_ucs4_string($this->_utf8_to_ucs4($return)); |
|
246 |
break; |
|
247 |
case 'ucs4_array': |
|
248 |
return $this->_utf8_to_ucs4($return); |
|
249 |
break; |
|
250 |
default: |
|
251 |
$this->_error('Unsupported output format'); |
|
252 |
return false; |
|
253 |
} |
|
254 |
} |
|
255 |
|
|
256 |
/** |
|
257 |
* Encode a given UTF-8 domain name |
|
258 |
* @param string Domain name (UTF-8 or UCS-4) |
|
259 |
* [@param string Desired input encoding, see {@link set_parameter}] |
|
260 |
* @return string Encoded Domain name (ACE string) |
|
261 |
*/ |
|
262 |
public function encode($decoded, $one_time_encoding = false) |
|
263 |
{ |
|
264 |
// Forcing conversion of input to UCS4 array |
|
265 |
// If a one-time encoding is given, use it; otherwise use the object's property |
|
266 |
switch ($one_time_encoding ? $one_time_encoding : $this->_api_encoding) { |
|
267 |
case 'utf8': |
|
268 |
$decoded = $this->_utf8_to_ucs4($decoded); |
|
269 |
break; |
|
270 |
case 'ucs4_string': |
|
271 |
$decoded = $this->_ucs4_string_to_ucs4($decoded); |
|
272 |
case 'ucs4_array': |
|
273 |
break; |
|
274 |
default: |
|
275 |
$this->_error('Unsupported input format: '.($one_time_encoding ? $one_time_encoding : $this->_api_encoding)); |
|
276 |
return false; |
|
277 |
} |
|
278 |
|
|
279 |
// No input, no output, what else did you expect? |
|
280 |
if (empty($decoded)) return ''; |
|
281 |
|
|
282 |
// Anchors for iteration |
|
283 |
$last_begin = 0; |
|
284 |
// Output string |
|
285 |
$output = ''; |
|
286 |
foreach ($decoded as $k => $v) { |
|
287 |
// Make sure to use just the plain dot |
|
288 |
switch($v) { |
|
289 |
case 0x3002: |
|
290 |
case 0xFF0E: |
|
291 |
case 0xFF61: |
|
292 |
$decoded[$k] = 0x2E; |
|
293 |
// Right, no break here, the above are converted to dots anyway |
|
294 |
// Stumbling across an anchoring character |
|
295 |
case 0x2E: |
|
296 |
case 0x2F: |
|
297 |
case 0x3A: |
|
298 |
case 0x3F: |
|
299 |
case 0x40: |
|
300 |
// Neither email addresses nor URLs allowed in strict mode |
|
301 |
if ($this->_strict_mode) { |
|
302 |
$this->_error('Neither email addresses nor URLs are allowed in strict mode.'); |
|
303 |
return false; |
|
304 |
} else { |
|
305 |
// Skip first char |
|
306 |
if ($k) { |
|
307 |
$encoded = ''; |
|
308 |
$encoded = $this->_encode(array_slice($decoded, $last_begin, (($k)-$last_begin))); |
|
309 |
if ($encoded) { |
|
310 |
$output .= $encoded; |
|
311 |
} else { |
|
312 |
$output .= $this->_ucs4_to_utf8(array_slice($decoded, $last_begin, (($k)-$last_begin))); |
|
313 |
} |
|
314 |
$output .= chr($decoded[$k]); |
|
315 |
} |
|
316 |
$last_begin = $k + 1; |
|
317 |
} |
|
318 |
} |
|
319 |
} |
|
320 |
// Catch the rest of the string |
|
321 |
if ($last_begin) { |
|
322 |
$inp_len = sizeof($decoded); |
|
323 |
$encoded = ''; |
|
324 |
$encoded = $this->_encode(array_slice($decoded, $last_begin, (($inp_len)-$last_begin))); |
|
325 |
if ($encoded) { |
|
326 |
$output .= $encoded; |
|
327 |
} else { |
|
328 |
$output .= $this->_ucs4_to_utf8(array_slice($decoded, $last_begin, (($inp_len)-$last_begin))); |
|
329 |
} |
|
330 |
return $output; |
|
331 |
} else { |
|
332 |
if ($output = $this->_encode($decoded)) { |
|
333 |
return $output; |
|
334 |
} else { |
|
335 |
return $this->_ucs4_to_utf8($decoded); |
|
336 |
} |
|
337 |
} |
|
338 |
} |
|
339 |
|
|
340 |
/** |
|
341 |
* Removes a weakness of encode(), which cannot properly handle URIs but instead encodes their |
|
342 |
* path or query components, too. |
|
343 |
* @param string $uri Expects the URI as a UTF-8 (or ASCII) string |
|
344 |
* @return string The URI encoded to Punycode, everything but the host component is left alone |
|
345 |
* @since 0.6.4 |
|
346 |
*/ |
|
347 |
public function encode_uri($uri) |
|
348 |
{ |
|
349 |
$parsed = parse_url($uri); |
|
350 |
if (!isset($parsed['host'])) { |
|
351 |
$this->_error('The given string does not look like a URI'); |
|
352 |
return false; |
|
353 |
} |
|
354 |
$arr = explode('.', $parsed['host']); |
|
355 |
foreach ($arr as $k => $v) { |
|
356 |
$conv = $this->encode($v, 'utf8'); |
|
357 |
if ($conv) $arr[$k] = $conv; |
|
358 |
} |
|
359 |
$parsed['host'] = join('.', $arr); |
|
360 |
$return = |
|
361 |
(empty($parsed['scheme']) ? '' : $parsed['scheme'].(strtolower($parsed['scheme']) == 'mailto' ? ':' : '://')) |
|
362 |
.(empty($parsed['user']) ? '' : $parsed['user'].(empty($parsed['pass']) ? '' : ':'.$parsed['pass']).'@') |
|
363 |
.$parsed['host'] |
|
364 |
.(empty($parsed['port']) ? '' : ':'.$parsed['port']) |
|
365 |
.(empty($parsed['path']) ? '' : $parsed['path']) |
|
366 |
.(empty($parsed['query']) ? '' : '?'.$parsed['query']) |
|
367 |
.(empty($parsed['fragment']) ? '' : '#'.$parsed['fragment']); |
|
368 |
return $return; |
|
369 |
} |
|
370 |
|
|
371 |
/** |
|
372 |
* Use this method to get the last error that occurred |
|
373 |
* @param void |
|
374 |
* @return string The last error that occurred |
|
375 |
*/ |
|
376 |
public function get_last_error() |
|
377 |
{ |
|
378 |
return $this->_error; |
|
379 |
} |
|
380 |
|
|
381 |
/** |
|
382 |
* The actual decoding algorithm |
|
383 |
* @param string |
|
384 |
* @return mixed |
|
385 |
*/ |
|
386 |
protected function _decode($encoded) |
|
387 |
{ |
|
388 |
$decoded = array(); |
|
389 |
// find the Punycode prefix |
|
390 |
if (!preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $encoded)) { |
|
391 |
$this->_error('This is not a punycode string'); |
|
392 |
return false; |
|
393 |
} |
|
394 |
$encode_test = preg_replace('!^'.preg_quote($this->_punycode_prefix, '!').'!', '', $encoded); |
|
395 |
// If nothing left after removing the prefix, it is hopeless |
|
396 |
if (!$encode_test) { |
|
397 |
$this->_error('The given encoded string was empty'); |
|
398 |
return false; |
|
399 |
} |
|
400 |
// Find the last occurrence of the delimiter |
|
401 |
$delim_pos = strrpos($encoded, '-'); |
|
402 |
if ($delim_pos > strlen($this->_punycode_prefix)) { |
|
403 |
for ($k = strlen($this->_punycode_prefix); $k < $delim_pos; ++$k) { |
|
404 |
$decoded[] = ord($encoded[$k]); |
|
405 |
} |
|
406 |
} |
|
407 |
$deco_len = count($decoded); |
|
408 |
$enco_len = strlen($encoded); |
|
409 |
|
|
410 |
// Wandering through the strings; init |
|
411 |
$is_first = true; |
|
412 |
$bias = $this->_initial_bias; |
|
413 |
$idx = 0; |
|
414 |
$char = $this->_initial_n; |
|
415 |
|
|
416 |
for ($enco_idx = ($delim_pos) ? ($delim_pos + 1) : 0; $enco_idx < $enco_len; ++$deco_len) { |
|
417 |
for ($old_idx = $idx, $w = 1, $k = $this->_base; 1 ; $k += $this->_base) { |
|
418 |
$digit = $this->_decode_digit($encoded[$enco_idx++]); |
|
419 |
$idx += $digit * $w; |
|
420 |
$t = ($k <= $bias) ? $this->_tmin : |
|
421 |
(($k >= $bias + $this->_tmax) ? $this->_tmax : ($k - $bias)); |
|
422 |
if ($digit < $t) break; |
|
423 |
$w = (int) ($w * ($this->_base - $t)); |
|
424 |
} |
|
425 |
$bias = $this->_adapt($idx - $old_idx, $deco_len + 1, $is_first); |
|
426 |
$is_first = false; |
|
427 |
$char += (int) ($idx / ($deco_len + 1)); |
|
428 |
$idx %= ($deco_len + 1); |
|
429 |
if ($deco_len > 0) { |
|
430 |
// Make room for the decoded char |
|
431 |
for ($i = $deco_len; $i > $idx; $i--) $decoded[$i] = $decoded[($i - 1)]; |
|
432 |
} |
|
433 |
$decoded[$idx++] = $char; |
|
434 |
} |
|
435 |
return $this->_ucs4_to_utf8($decoded); |
|
436 |
} |
|
437 |
|
|
438 |
/** |
|
439 |
* The actual encoding algorithm |
|
440 |
* @param array UCS-4 code points |
|
441 |
* @return mixed |
|
442 |
*/ |
|
443 |
protected function _encode($decoded) |
|
444 |
{ |
|
445 |
// We cannot encode a domain name containing the Punycode prefix |
|
446 |
$extract = strlen($this->_punycode_prefix); |
|
447 |
$check_pref = $this->_utf8_to_ucs4($this->_punycode_prefix); |
|
448 |
$check_deco = array_slice($decoded, 0, $extract); |
|
449 |
|
|
450 |
if ($check_pref == $check_deco) { |
|
451 |
$this->_error('This is already a punycode string'); |
|
452 |
return false; |
|
453 |
} |
|
454 |
// We will not try to encode strings consisting of basic code points only |
|
455 |
$encodable = false; |
|
456 |
foreach ($decoded as $k => $v) { |
|
457 |
if ($v > 0x7a) { |
|
458 |
$encodable = true; |
|
459 |
break; |
|
460 |
} |
|
461 |
} |
|
462 |
if (!$encodable) { |
|
463 |
$this->_error('The given string does not contain encodable chars'); |
|
464 |
return false; |
|
465 |
} |
|
466 |
// Do NAMEPREP |
|
467 |
$decoded = $this->_nameprep($decoded); |
|
468 |
if (!$decoded || !is_array($decoded)) return false; // NAMEPREP failed |
|
469 |
$deco_len = count($decoded); |
|
470 |
if (!$deco_len) return false; // Empty array |
|
471 |
$codecount = 0; // How many chars have been consumed |
|
472 |
$encoded = ''; |
|
473 |
// Copy all basic code points to output |
|
474 |
for ($i = 0; $i < $deco_len; ++$i) { |
|
475 |
$test = $decoded[$i]; |
|
476 |
// Will match [-0-9a-zA-Z] |
|
477 |
if ((0x2F < $test && $test < 0x40) || (0x40 < $test && $test < 0x5B) |
|
478 |
|| (0x60 < $test && $test <= 0x7B) || (0x2D == $test)) { |
|
479 |
$encoded .= chr($decoded[$i]); |
|
480 |
$codecount++; |
|
481 |
} |
|
482 |
} |
|
483 |
if ($codecount == $deco_len) return $encoded; // All codepoints were basic ones |
|
484 |
|
|
485 |
// Start with the prefix; copy it to output |
|
486 |
$encoded = $this->_punycode_prefix.$encoded; |
|
487 |
// If we have basic code points in output, add a hyphen to the end |
|
488 |
if ($codecount) $encoded .= '-'; |
|
489 |
// Now find and encode all non-basic code points |
|
490 |
$is_first = true; |
|
491 |
$cur_code = $this->_initial_n; |
|
492 |
$bias = $this->_initial_bias; |
|
493 |
$delta = 0; |
|
494 |
while ($codecount < $deco_len) { |
|
495 |
// Find the smallest code point >= the current code point and |
|
496 |
// remember the last occurrence of it in the input |
|
497 |
for ($i = 0, $next_code = $this->_max_ucs; $i < $deco_len; $i++) { |
|
498 |
if ($decoded[$i] >= $cur_code && $decoded[$i] <= $next_code) { |
|
499 |
$next_code = $decoded[$i]; |
|
500 |
} |
|
501 |
} |
|
502 |
$delta += ($next_code - $cur_code) * ($codecount + 1); |
|
503 |
$cur_code = $next_code; |
|
504 |
|
|
505 |
// Scan input again and encode all characters whose code point is $cur_code |
|
506 |
for ($i = 0; $i < $deco_len; $i++) { |
|
507 |
if ($decoded[$i] < $cur_code) { |
|
508 |
$delta++; |
|
509 |
} elseif ($decoded[$i] == $cur_code) { |
|
510 |
for ($q = $delta, $k = $this->_base; 1; $k += $this->_base) { |
|
511 |
$t = ($k <= $bias) ? $this->_tmin : |
|
512 |
(($k >= $bias + $this->_tmax) ? $this->_tmax : $k - $bias); |
|
513 |
if ($q < $t) break; |
|
514 |
$encoded .= $this->_encode_digit(intval($t + (($q - $t) % ($this->_base - $t)))); //v0.4.5 Changed from ceil() to intval() |
|
515 |
$q = (int) (($q - $t) / ($this->_base - $t)); |
|
516 |
} |
|
517 |
$encoded .= $this->_encode_digit($q); |
|
518 |
$bias = $this->_adapt($delta, $codecount+1, $is_first); |
|
519 |
$codecount++; |
|
520 |
$delta = 0; |
|
521 |
$is_first = false; |
|
522 |
} |
|
523 |
} |
|
524 |
$delta++; |
|
525 |
$cur_code++; |
|
526 |
} |
|
527 |
return $encoded; |
|
528 |
} |
|
529 |
|
|
530 |
/** |
|
531 |
* Adapt the bias according to the current code point and position |
|
532 |
* @param int $delta |
|
533 |
* @param int $npoints |
|
534 |
* @param boolean $is_first |
|
535 |
* @return int |
|
536 |
*/ |
|
537 |
protected function _adapt($delta, $npoints, $is_first) |
|
538 |
{ |
|
539 |
$delta = intval($is_first ? ($delta / $this->_damp) : ($delta / 2)); |
|
540 |
$delta += intval($delta / $npoints); |
|
541 |
for ($k = 0; $delta > (($this->_base - $this->_tmin) * $this->_tmax) / 2; $k += $this->_base) { |
|
542 |
$delta = intval($delta / ($this->_base - $this->_tmin)); |
|
543 |
} |
|
544 |
return intval($k + ($this->_base - $this->_tmin + 1) * $delta / ($delta + $this->_skew)); |
|
545 |
} |
|
546 |
|
|
547 |
/** |
|
548 |
* Encoding a certain digit |
|
549 |
* @param int $d |
|
550 |
* @return string |
|
551 |
*/ |
|
552 |
protected function _encode_digit($d) |
|
553 |
{ |
|
554 |
return chr($d + 22 + 75 * ($d < 26)); |
|
555 |
} |
|
556 |
|
|
557 |
/** |
|
558 |
* Decode a certain digit |
|
559 |
* @param string $cp Single character (one Punycode digit) |
|
560 |
* @return int |
|
561 |
*/ |
|
562 |
protected function _decode_digit($cp) |
|
563 |
{ |
|
564 |
$cp = ord($cp); |
|
565 |
return ($cp - 48 < 10) ? $cp - 22 : (($cp - 65 < 26) ? $cp - 65 : (($cp - 97 < 26) ? $cp - 97 : $this->_base)); |
|
566 |
} |
|
567 |
|
|
568 |
/** |
|
569 |
* Internal error handling method |
|
570 |
* @param string $error |
|
571 |
*/ |
|
572 |
protected function _error($error = '') |
|
573 |
{ |
|
574 |
$this->_error = $error; |
|
575 |
} |
|
576 |
|
|
577 |
/** |
|
578 |
* Do Nameprep according to RFC3491 and RFC3454 |
|
579 |
* @param array Unicode Characters |
|
580 |
* @return array Unicode characters, Nameprep'd (false on prohibited input) |
|
581 |
*/ |
|
582 |
protected function _nameprep($input) |
|
583 |
{ |
|
584 |
$output = array(); |
|
585 |
$error = false; |
|
586 |
// |
|
587 |
// Mapping |
|
588 |
// Walking through the input array, performing the required steps on each of |
|
589 |
// the input chars and putting the result into the output array |
|
590 |
// While mapping required chars we apply the canonical ordering |
|
591 |
foreach ($input as $v) { |
|
592 |
// Map to nothing == skip that code point |
|
593 |
if (in_array($v, $this->NP['map_nothing'])) continue; |
|
594 |
// Try to find prohibited input |
|
595 |
if (in_array($v, $this->NP['prohibit']) || in_array($v, $this->NP['general_prohibited'])) { |
|
596 |
$this->_error('NAMEPREP: Prohibited input U+'.sprintf('%08X', $v)); |
|
597 |
return false; |
|
598 |
} |
|
599 |
foreach ($this->NP['prohibit_ranges'] as $range) { |
|
600 |
if ($range[0] <= $v && $v <= $range[1]) { |
|
601 |
$this->_error('NAMEPREP: Prohibited input U+'.sprintf('%08X', $v)); |
|
602 |
return false; |
|
603 |
} |
|
604 |
} |
|
605 |
// Hangul syllable decomposition |
|
606 |
if (0xAC00 <= $v && $v <= 0xD7AF) { |
|
607 |
foreach ($this->_hangul_decompose($v) as $out) $output[] = (int) $out; |
|
608 |
// There's a decomposition mapping for that code point |
|
609 |
} elseif (isset($this->NP['replacemaps'][$v])) { |
|
610 |
foreach ($this->_apply_cannonical_ordering($this->NP['replacemaps'][$v]) as $out) { |
|
611 |
$output[] = (int) $out; |
|
612 |
} |
|
613 |
} else { |
|
614 |
$output[] = (int) $v; |
|
615 |
} |
|
616 |
} |
|
617 |
// Before applying any Combining, try to rearrange any Hangul syllables |
|
618 |
$output = $this->_hangul_compose($output); |
|
619 |
// |
|
620 |
// Combine code points |
|
621 |
// |
|
622 |
$last_class = 0; |
|
623 |
$last_starter = 0; |
|
624 |
$out_len = count($output); |
|
625 |
for ($i = 0; $i < $out_len; ++$i) { |
|
626 |
$class = $this->_get_combining_class($output[$i]); |
|
627 |
if ((!$last_class || $last_class > $class) && $class) { |
|
628 |
// Try to match |
|
629 |
$seq_len = $i - $last_starter; |
|
630 |
$out = $this->_combine(array_slice($output, $last_starter, $seq_len)); |
|
631 |
// On match: Replace the last starter with the composed character and remove |
|
632 |
// the now redundant non-starter(s) |
|
633 |
if ($out) { |
|
634 |
$output[$last_starter] = $out; |
|
635 |
if (count($out) != $seq_len) { |
|
636 |
for ($j = $i+1; $j < $out_len; ++$j) $output[$j-1] = $output[$j]; |
|
637 |
unset($output[$out_len]); |
|
638 |
} |
|
639 |
// Rewind the for loop by one, since there can be more possible compositions |
|
640 |
$i--; |
|
641 |
$out_len--; |
|
642 |
$last_class = ($i == $last_starter) ? 0 : $this->_get_combining_class($output[$i-1]); |
|
643 |
continue; |
|
644 |
} |
|
645 |
} |
|
646 |
// The current class is 0 |
|
647 |
if (!$class) $last_starter = $i; |
|
648 |
$last_class = $class; |
|
649 |
} |
|
650 |
return $output; |
|
651 |
} |
|
652 |
|
|
653 |
/** |
|
654 |
* Decomposes a Hangul syllable |
|
655 |
* (see http://www.unicode.org/unicode/reports/tr15/#Hangul) |
|
656 |
* @param integer 32bit UCS4 code point |
|
657 |
* @return array Either Hangul Syllable decomposed or original 32bit value as one value array |
|
658 |
*/ |
|
659 |
protected function _hangul_decompose($char) |
|
660 |
{ |
|
661 |
$sindex = (int) $char - $this->_sbase; |
|
662 |
if ($sindex < 0 || $sindex >= $this->_scount) return array($char); |
|
663 |
$result = array(); |
|
664 |
$result[] = (int) $this->_lbase + $sindex / $this->_ncount; |
|
665 |
$result[] = (int) $this->_vbase + ($sindex % $this->_ncount) / $this->_tcount; |
|
666 |
$T = intval($this->_tbase + $sindex % $this->_tcount); |
|
667 |
if ($T != $this->_tbase) $result[] = $T; |
|
668 |
return $result; |
|
669 |
} |
|
670 |
/** |
|
671 |
* Composes a Hangul syllable |
|
672 |
* (see http://www.unicode.org/unicode/reports/tr15/#Hangul) |
|
673 |
* @param array Decomposed UCS4 sequence |
|
674 |
* @return array UCS4 sequence with syllables composed |
|
675 |
*/ |
|
676 |
protected function _hangul_compose($input) |
|
677 |
{ |
|
678 |
$inp_len = count($input); |
|
679 |
if (!$inp_len) return array(); |
|
680 |
$result = array(); |
|
681 |
$last = (int) $input[0]; |
|
682 |
$result[] = $last; // copy first char from input to output |
|
683 |
|
|
684 |
for ($i = 1; $i < $inp_len; ++$i) { |
|
685 |
$char = (int) $input[$i]; |
|
686 |
$sindex = $last - $this->_sbase; |
|
687 |
$lindex = $last - $this->_lbase; |
|
688 |
$vindex = $char - $this->_vbase; |
|
689 |
$tindex = $char - $this->_tbase; |
|
690 |
// Find out, whether two current characters are LV and T |
|
691 |
if (0 <= $sindex && $sindex < $this->_scount && ($sindex % $this->_tcount == 0) |
|
692 |
&& 0 < $tindex && $tindex < $this->_tcount) { |
|
693 |
// create syllable of form LVT |
|
694 |
$last += $tindex; |
|
695 |
$result[(count($result) - 1)] = $last; // reset last |
|
696 |
continue; // discard char |
|
697 |
} |
|
698 |
// Find out, whether two current characters form L and V |
|
699 |
if (0 <= $lindex && $lindex < $this->_lcount && 0 <= $vindex && $vindex < $this->_vcount) { |
|
700 |
// create syllable of form LV |
|
701 |
$last = (int) $this->_sbase + ($lindex * $this->_vcount + $vindex) * $this->_tcount; |
|
702 |
$result[(count($result) - 1)] = $last; // reset last |
|
703 |
continue; // discard char |
|
704 |
} |
|
705 |
// if neither case was true, just add the character |
|
706 |
$last = $char; |
|
707 |
$result[] = $char; |
|
708 |
} |
|
709 |
return $result; |
|
710 |
} |
|
711 |
|
|
712 |
/** |
|
713 |
* Returns the combining class of a certain wide char |
|
714 |
* @param integer Wide char to check (32bit integer) |
|
715 |
* @return integer Combining class if found, else 0 |
|
716 |
*/ |
|
717 |
protected function _get_combining_class($char) |
|
718 |
{ |
|
719 |
return isset($this->NP['norm_combcls'][$char]) ? $this->NP['norm_combcls'][$char] : 0; |
|
720 |
} |
|
721 |
|
|
722 |
/** |
|
723 |
* Applies the canonical ordering of a decomposed UCS4 sequence |
|
724 |
* @param array Decomposed UCS4 sequence |
|
725 |
* @return array Ordered UCS4 sequence |
|
726 |
*/ |
|
727 |
protected function _apply_cannonical_ordering($input) |
|
728 |
{ |
|
729 |
$swap = true; |
|
730 |
$size = count($input); |
|
731 |
while ($swap) { |
|
732 |
$swap = false; |
|
733 |
$last = $this->_get_combining_class(intval($input[0])); |
|
734 |
for ($i = 0; $i < $size-1; ++$i) { |
|
735 |
$next = $this->_get_combining_class(intval($input[$i+1])); |
|
736 |
if ($next != 0 && $last > $next) { |
|
737 |
// Move item leftward until it fits |
|
738 |
for ($j = $i + 1; $j > 0; --$j) { |
|
739 |
if ($this->_get_combining_class(intval($input[$j-1])) <= $next) break; |
|
740 |
$t = intval($input[$j]); |
|
741 |
$input[$j] = intval($input[$j-1]); |
|
742 |
$input[$j-1] = $t; |
|
743 |
$swap = true; |
|
744 |
} |
|
745 |
// Reentering the loop looking at the old character again |
|
746 |
$next = $last; |
|
747 |
} |
|
748 |
$last = $next; |
|
749 |
} |
|
750 |
} |
|
751 |
return $input; |
|
752 |
} |
|
753 |
|
|
754 |
/** |
|
755 |
* Do composition of a sequence of starter and non-starter |
|
756 |
* @param array UCS4 Decomposed sequence |
|
757 |
* @return integer|false Composed UCS4 code point, or false if no composition matches |
|
758 |
*/ |
|
759 |
protected function _combine($input) |
|
760 |
{ |
|
761 |
$inp_len = count($input); |
|
762 |
foreach ($this->NP['replacemaps'] as $np_src => $np_target) { |
|
763 |
if ($np_target[0] != $input[0]) continue; |
|
764 |
if (count($np_target) != $inp_len) continue; |
|
765 |
$hit = false; |
|
766 |
foreach ($input as $k2 => $v2) { |
|
767 |
if ($v2 == $np_target[$k2]) { |
|
768 |
$hit = true; |
|
769 |
} else { |
|
770 |
$hit = false; |
|
771 |
break; |
|
772 |
} |
|
773 |
} |
|
774 |
if ($hit) return $np_src; |
|
775 |
} |
|
776 |
return false; |
|
777 |
} |
|
778 |
|
|
779 |
/** |
|
780 |
* This converts a UTF-8 encoded string to its UCS-4 representation |
|
781 |
* By talking about UCS-4 "strings" we mean arrays of 32bit integers representing |
|
782 |
* each of the "chars". This is due to PHP not being able to handle strings with |
|
783 |
* bit depth different from 8. This applies to the reverse method _ucs4_to_utf8(), too. |
|
784 |
* The following UTF-8 encodings are supported: |
|
785 |
* bytes bits representation |
|
786 |
* 1 7 0xxxxxxx |
|
787 |
* 2 11 110xxxxx 10xxxxxx |
|
788 |
* 3 16 1110xxxx 10xxxxxx 10xxxxxx |
|
789 |
* 4 21 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
790 |
* 5 26 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
791 |
* 6 31 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
792 |
* Each x represents a bit that can be used to store character data. |
|
793 |
* The five and six byte sequences are part of Annex D of ISO/IEC 10646-1:2000 |
|
794 |
* @param string $input |
|
795 |
* @return array|false UCS-4 code points as 32bit integers, or false on malformed input |
|
796 |
*/ |
|
797 |
protected function _utf8_to_ucs4($input) |
|
798 |
{ |
|
799 |
$output = array(); |
|
800 |
$out_len = 0; |
|
801 |
// Patch by Daniel Hahler; work around problem with mbstring.func_overload |
|
802 |
if (function_exists('mb_strlen')) { |
|
803 |
$inp_len = mb_strlen($input, '8bit'); |
|
804 |
} else { |
|
805 |
$inp_len = strlen($input); |
|
806 |
} |
|
807 |
$mode = 'next'; |
|
808 |
$test = 'none'; |
|
809 |
for ($k = 0; $k < $inp_len; ++$k) { |
|
810 |
$v = ord($input{$k}); // Extract byte from input string |
|
811 |
if ($v < 128) { // We found an ASCII char - put into string as is |
|
812 |
$output[$out_len] = $v; |
|
813 |
++$out_len; |
|
814 |
if ('add' == $mode) { |
|
815 |
$this->_error('Conversion from UTF-8 to UCS-4 failed: malformed input at byte '.$k); |
|
816 |
return false; |
|
817 |
} |
|
818 |
continue; |
|
819 |
} |
|
820 |
if ('next' == $mode) { // Try to find the next start byte; determine the width of the Unicode char |
|
821 |
$start_byte = $v; |
|
822 |
$mode = 'add'; |
|
823 |
$test = 'range'; |
|
824 |
if ($v >> 5 == 6) { // &110xxxxx 10xxxxxx |
|
825 |
$next_byte = 0; // Tells, how many times subsequent bitmasks must rotate 6bits to the left |
|
826 |
$v = ($v - 192) << 6; |
|
827 |
} elseif ($v >> 4 == 14) { // &1110xxxx 10xxxxxx 10xxxxxx |
|
828 |
$next_byte = 1; |
|
829 |
$v = ($v - 224) << 12; |
|
830 |
} elseif ($v >> 3 == 30) { // &11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
831 |
$next_byte = 2; |
|
832 |
$v = ($v - 240) << 18; |
|
833 |
} elseif ($v >> 2 == 62) { // &111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
834 |
$next_byte = 3; |
|
835 |
$v = ($v - 248) << 24; |
|
836 |
} elseif ($v >> 1 == 126) { // &1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
|
837 |
$next_byte = 4; |
|
838 |
$v = ($v - 252) << 30; |
fixed inclusion of SecureForm
added IDNA/Punycode to wb::validate_email()