Revision 1378

Added by Luisehahne almost 14 years ago

fixed inclusion of SecureForm
added IDNA/Punycode to wb::validate_email()
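
The commit message only names the change. As a rough illustration of what adding IDNA/Punycode to an email check can involve, the sketch below converts the domain part of an address to its ASCII (xn--) form and then runs an ordinary ASCII validator over it. It is not the code from wb::validate_email(): every function name here is hypothetical, it uses a small pure-PHP Punycode encoder (RFC 3492) rather than the bundled idna_convert class, it skips Nameprep (so input is assumed to be lowercase and already normalized), and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helpers for illustration; not the wb::validate_email() code.

/** Split a UTF-8 string into an array of integer code points. */
function utf8_to_codepoints(string $s): array
{
    $cps = [];
    foreach (preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) ?: [] as $ch) {
        $b = array_values(unpack('C*', $ch));
        $n = count($b);
        if ($n === 1) {
            $cps[] = $b[0];
            continue;
        }
        // The lead byte carries 5, 4 or 3 payload bits (2-, 3-, 4-byte forms).
        $cp = $b[0] & (0xFF >> ($n + 1));
        for ($i = 1; $i < $n; $i++) {
            $cp = ($cp << 6) | ($b[$i] & 0x3F);
        }
        $cps[] = $cp;
    }
    return $cps;
}

/** Map a Punycode digit value 0..35 to 'a'..'z', '0'..'9'. */
function punycode_digit(int $d): string
{
    return chr($d < 26 ? 97 + $d : 22 + $d);
}

/** Bias adaptation from RFC 3492, section 6.1. */
function punycode_adapt(int $delta, int $numPoints, bool $first): int
{
    $delta = intdiv($delta, $first ? 700 : 2);
    $delta += intdiv($delta, $numPoints);
    for ($k = 0; $delta > 455; $k += 36) {   // 455 = ((36 - 1) * 26) / 2
        $delta = intdiv($delta, 35);
    }
    return $k + intdiv(36 * $delta, $delta + 38);
}

/** Encode one domain label; pure-ASCII labels are returned unchanged. */
function punycode_encode_label(string $label): string
{
    $cps = utf8_to_codepoints($label);
    $basic = array_filter($cps, function ($c) { return $c < 128; });
    $b = $h = count($basic);
    if ($h === count($cps)) {
        return $label;
    }
    $out = implode('', array_map('chr', $basic)) . ($b > 0 ? '-' : '');
    $n = 128;
    $delta = 0;
    $bias = 72;
    while ($h < count($cps)) {
        // Next code point to handle: the smallest one not yet encoded.
        $m = min(array_filter($cps, function ($c) use ($n) { return $c >= $n; }));
        $delta += ($m - $n) * ($h + 1);
        $n = $m;
        foreach ($cps as $c) {
            if ($c < $n) {
                $delta++;
            }
            if ($c === $n) {
                // Emit $delta as a variable-length base-36 integer.
                $q = $delta;
                for ($k = 36; ; $k += 36) {
                    $t = $k <= $bias ? 1 : ($k >= $bias + 26 ? 26 : $k - $bias);
                    if ($q < $t) {
                        break;
                    }
                    $out .= punycode_digit($t + ($q - $t) % (36 - $t));
                    $q = intdiv($q - $t, 36 - $t);
                }
                $out .= punycode_digit($q);
                $bias = punycode_adapt($delta, $h + 1, $h === $b);
                $delta = 0;
                $h++;
            }
        }
        $delta++;
        $n++;
    }
    return 'xn--' . $out;
}

/** Convert only the domain part of an address to its ASCII form. */
function email_to_ascii(string $email): string
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return $email;
    }
    $labels = explode('.', substr($email, $at + 1));
    return substr($email, 0, $at + 1)
        . implode('.', array_map('punycode_encode_label', $labels));
}

/** Validate an address after converting its domain to ASCII. */
function validate_email_idn(string $email): bool
{
    return filter_var(email_to_ascii($email), FILTER_VALIDATE_EMAIL) !== false;
}
```

With these helpers, email_to_ascii("user@b\u{FC}cher.de") yields user@xn--bcher-kva.de, which the stock ASCII validator then accepts. The local part is left untouched, so addresses with non-ASCII local parts would still be rejected in this sketch.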

View differences:

branches/2.8.x/CHANGELOG
11 11
! = Update/Change
12 12

  
13 13
------------------------------------- 2.8.2 -------------------------------------
14
13 Jan-2011 Build 1378 Werner von den Decken (DarkViper)
15
! fixed inclusion of SecureForm
16
+ added IDNA/Punycode to wb::validate_email()
14 17
11 Jan-2011 Build 1377 Frank Heyne (FrankH)
15 18
# Security fix for modules jsadmin, menu_link and output_filter
16 19
11 Jan-2011 Build 1376 Frank Heyne (FrankH)
branches/2.8.x/wb/include/idna_convert/example.php
1
<?php
2
$encoded = $decoded = $add = '';
3
header('Content-Type: text/html; charset=utf-8');
4
require_once('idna_convert.class.php');
5
$IDN = new idna_convert();
6
if (isset($_REQUEST['encode'])) {
7
    $decoded = isset($_REQUEST['decoded']) ? stripslashes($_REQUEST['decoded']) : '';
8
    $encoded = $IDN->encode($decoded);
9
}
10
if (isset($_REQUEST['decode'])) {
11
    $encoded = isset($_REQUEST['encoded']) ? stripslashes($_REQUEST['encoded']) : '';
12
    $decoded = $IDN->decode($encoded);
13
}
14
$lang = 'en';
15
if (isset($_REQUEST['lang'])) {
16
    if ('de' == $_REQUEST['lang'] || 'en' == $_REQUEST['lang']) $lang = $_REQUEST['lang'];
17
    // Emit the validated $lang, not the raw request value, to avoid HTML injection.
    $add .= '<input type="hidden" name="lang" value="'.$lang.'" />'."\n";
18
}
19
?>
20
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
21
<html xmlns="http://www.w3.org/1999/xhtml">
22
<head>
23
<title>phlyLabs Punycode Converter</title>
24
<meta name="author" content="phlyLabs" />
25
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
26
<style type="text/css">
27
/*<![CDATA[*/
28
body { color:black;background:white;font-size:10pt;font-family:Verdana,Helvetica,Sans-Serif; }
29
body, form { margin:0; }
30
form { display:inline; }
31
input { font-size:8pt;font-family:Verdana,Helvetica,Sans-Serif; }
32
#round { width:730px;padding:10px;background-color:rgb(230,230,240);border:1px solid black;text-align:center;vertical-align:middle;margin:auto;margin-top:50px; }
33
th { font-size:9pt;font-weight:bold; }
34
#copy { font-size:8pt;color:rgb(60,60,80); }
35
#subhead { font-size:8pt; }
36
#bla { font-size:8pt;text-align:left; }
37
h5 {margin:0;font-size:11pt;font-weight:bold;}
38
/*]]>*/
39
</style>
40
</head>
41
<body>
42
<div id="round">
43
 <h5>phlyLabs' pure PHP IDNA Converter</h5><br />
44
 <span id="subhead">
45
  See <a href="http://faqs.org/rfcs/rfc3490.html" title="IDNA" target="_blank">RFC3490</a>,
46
  <a href="http://faqs.org/rfcs/rfc3491.html" title="Nameprep, a Stringprep profile" target="_blank">RFC3491</a>,
47
  <a href="http://faqs.org/rfcs/rfc3492.html" title="Punycode" target="_blank">RFC3492</a> and
48
  <a href="http://faqs.org/rfcs/rfc3454.html" title="Stringprep" target="_blank">RFC3454</a><br />
49
 </span>
50
 <br />
51
 <div id="bla"><?php if ($lang == 'de') { ?>
52
  Dieser Konverter erlaubt die Übersetzung von Domainnamen zwischen der Punycode- und der
53
  Unicode-Schreibweise.<br />
54
  Geben Sie einfach den Domainnamen im entsprechend bezeichneten Feld ein und klicken Sie dann auf den darunter
55
  liegenden Button. Sie können einfache Domainnamen, komplette URLs (wie http://jürgen-müller.de)
56
  oder Emailadressen eingeben.<br />
57
  <br />
58
  Stellen Sie aber sicher, dass Ihr Browser den Zeichensatz <strong>UTF-8</strong> unterstützt.<br />
59
  <br />
60
  Wenn Sie Interesse an der zugrundeliegenden PHP-Klasse haben, können Sie diese
61
  <a href="http://phlymail.com/de/downloads/idna/download/">hier herunterladen</a>.<br />
62
  <br />
63
  Diese Klasse wird ohne Garantie ihrer Funktionstüchtigkeit bereit gestellt. Nutzung auf eigene Gefahr.<br />
64
  Um sicher zu stellen, dass eine Zeichenkette korrekt umgewandelt wurde, sollten Sie diese immer zurückwandeln
65
  und das Ergebnis mit Ihrer ursprünglichen Eingabe vergleichen.<br />
66
  <br />
67
  Fehler und Probleme können Sie gern an <a href="mailto:team@phlymail.de">team@phlymail.de</a> senden.<br />
68
  <?php } else { ?>
69
  This converter allows you to convert domain names between the encoded (Punycode) notation
70
  and the decoded (UTF-8) notation.<br />
71
  Just enter the domain name in the respective field and click on the button right below it to have
72
  it converted. Note that you can also enter complete domain names (like j&#xFC;rgen-m&#xFC;ller.de)
73
  or email addresses.<br />
74
  <br />
75
  Make sure that your browser supports the <strong>UTF-8</strong> character encoding.<br />
76
  <br />
77
  If you are interested in the PHP source of the underlying class, you can
78
  <a href="http://phlymail.com/en/downloads/idna/download/">download it here</a>.<br />
79
  <br />
80
  Please be aware that this class is provided as is and without any liability. Use at your own risk.<br />
81
  To ensure that a string has been converted correctly, convert it back again and compare the
82
  result with your original input.<br />
83
  <br />
84
  Please feel free to report bugs and problems to: <a href="mailto:team@phlymail.com">team@phlymail.com</a>.<br />
85
  <?php } ?>
86
  <br />
87
 </div>
88
 <table border="0" cellpadding="2" cellspacing="2" align="center">
89
  <thead>
90
   <tr>
91
    <th align="left">Original (Unicode)</th>
92
    <th align="right">Punycode (ACE)</th>
93
   </tr>
94
  </thead>
95
  <tbody>
96
   <tr>
97
    <td align="right">
98
     <form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get">
99
      <input type="text" name="decoded" value="<?php echo htmlentities($decoded, ENT_QUOTES, 'UTF-8'); ?>" size="48" maxlength="255" /><br />
100
      <input type="submit" name="encode" value="Encode &gt;&gt;" /><?php echo $add; ?>
101
     </form>
102
    </td>
103
    <td align="left">
104
     <form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get">
105
      <input type="text" name="encoded" value="<?php echo htmlentities($encoded, ENT_QUOTES, 'UTF-8'); ?>" size="48" maxlength="255" /><br />
106
      <input type="submit" name="decode" value="&lt;&lt; Decode" /><?php echo $add; ?>
107
     </form>
108
    </td>
109
   </tr>
110
  </tbody>
111
 </table>
112
 <br />
113
 <span id="copy">Version used: 0.6.9; &copy; 2004-2010 phlyLabs Berlin; part of <a href="http://phlymail.com/">phlyMail</a></span>
114
</div>
115
</body>
116
</html>
0 117
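
The example page advises converting a string in both directions and comparing the result with the input. To make the decode direction concrete, here is a minimal sketch of a Punycode label decoder following the RFC 3492 decoding procedure. It is not the idna_convert implementation: the function names are hypothetical, it handles a single label only, it performs no Nameprep or overflow checks, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helpers for illustration; not the idna_convert implementation.

/** Encode one Unicode code point as a UTF-8 byte string. */
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

/** Bias adaptation from RFC 3492, section 6.1. */
function punycode_adapt(int $delta, int $numPoints, bool $first): int
{
    $delta = intdiv($delta, $first ? 700 : 2);
    $delta += intdiv($delta, $numPoints);
    for ($k = 0; $delta > 455; $k += 36) {   // 455 = ((36 - 1) * 26) / 2
        $delta = intdiv($delta, 35);
    }
    return $k + intdiv(36 * $delta, $delta + 38);
}

/** Decode one "xn--" label to UTF-8; other labels are returned unchanged. */
function punycode_decode_label(string $label): string
{
    if (strncasecmp($label, 'xn--', 4) !== 0) {
        return $label;
    }
    $in = strtolower(substr($label, 4));
    $len = strlen($in);

    // Everything before the last '-' is literal ASCII.
    $out = [];
    $start = 0;
    $pos = strrpos($in, '-');
    if ($pos !== false && $pos > 0) {
        $out = array_map('ord', str_split(substr($in, 0, $pos)));
        $start = $pos + 1;
    }

    $n = 128;
    $i = 0;
    $bias = 72;
    for ($p = $start; $p < $len; ) {
        // Read one variable-length base-36 integer into $i.
        $oldi = $i;
        $w = 1;
        for ($k = 36; ; $k += 36) {
            if ($p >= $len) {
                throw new InvalidArgumentException('truncated Punycode input');
            }
            $c = ord($in[$p++]);
            if ($c >= 97 && $c <= 122) {
                $digit = $c - 97;          // 'a'..'z' => 0..25
            } elseif ($c >= 48 && $c <= 57) {
                $digit = $c - 22;          // '0'..'9' => 26..35
            } else {
                throw new InvalidArgumentException('invalid Punycode digit');
            }
            $i += $digit * $w;
            $t = $k <= $bias ? 1 : ($k >= $bias + 26 ? 26 : $k - $bias);
            if ($digit < $t) {
                break;
            }
            $w *= 36 - $t;
        }
        // $i now encodes both the code point increment and the insert position.
        $count = count($out) + 1;
        $bias = punycode_adapt($i - $oldi, $count, $oldi === 0);
        $n += intdiv($i, $count);
        $i %= $count;
        array_splice($out, $i, 0, [$n]);
        $i++;
    }
    return implode('', array_map('codepoint_to_utf8', $out));
}
```

Decoding xn--bcher-kva with this function gives bücher. Comparing such a decoded value with the string originally entered is the round-trip check the page recommends.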

  
branches/2.8.x/wb/include/idna_convert/LICENCE
1
		  GNU LESSER GENERAL PUBLIC LICENSE
2
		       Version 2.1, February 1999
3

  
4
 Copyright (C) 1991, 1999 Free Software Foundation, Inc.
5
     59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
6
 Everyone is permitted to copy and distribute verbatim copies
7
 of this license document, but changing it is not allowed.
8

  
9
[This is the first released version of the Lesser GPL.  It also counts
10
 as the successor of the GNU Library Public License, version 2, hence
11
 the version number 2.1.]
12

  
13
			    Preamble
14

  
15
  The licenses for most software are designed to take away your
16
freedom to share and change it.  By contrast, the GNU General Public
17
Licenses are intended to guarantee your freedom to share and change
18
free software--to make sure the software is free for all its users.
19

  
20
  This license, the Lesser General Public License, applies to some
21
specially designated software packages--typically libraries--of the
22
Free Software Foundation and other authors who decide to use it.  You
23
can use it too, but we suggest you first think carefully about whether
24
this license or the ordinary General Public License is the better
25
strategy to use in any particular case, based on the explanations below.
26

  
27
  When we speak of free software, we are referring to freedom of use,
28
not price.  Our General Public Licenses are designed to make sure that
29
you have the freedom to distribute copies of free software (and charge
30
for this service if you wish); that you receive source code or can get
31
it if you want it; that you can change the software and use pieces of
32
it in new free programs; and that you are informed that you can do
33
these things.
34

  
35
  To protect your rights, we need to make restrictions that forbid
36
distributors to deny you these rights or to ask you to surrender these
37
rights.  These restrictions translate to certain responsibilities for
38
you if you distribute copies of the library or if you modify it.
39

  
40
  For example, if you distribute copies of the library, whether gratis
41
or for a fee, you must give the recipients all the rights that we gave
42
you.  You must make sure that they, too, receive or can get the source
43
code.  If you link other code with the library, you must provide
44
complete object files to the recipients, so that they can relink them
45
with the library after making changes to the library and recompiling
46
it.  And you must show them these terms so they know their rights.
47

  
48
  We protect your rights with a two-step method: (1) we copyright the
49
library, and (2) we offer you this license, which gives you legal
50
permission to copy, distribute and/or modify the library.
51

  
52
  To protect each distributor, we want to make it very clear that
53
there is no warranty for the free library.  Also, if the library is
54
modified by someone else and passed on, the recipients should know
55
that what they have is not the original version, so that the original
56
author's reputation will not be affected by problems that might be
57
introduced by others.
58

  
59
  Finally, software patents pose a constant threat to the existence of
60
any free program.  We wish to make sure that a company cannot
61
effectively restrict the users of a free program by obtaining a
62
restrictive license from a patent holder.  Therefore, we insist that
63
any patent license obtained for a version of the library must be
64
consistent with the full freedom of use specified in this license.
65

  
66
  Most GNU software, including some libraries, is covered by the
67
ordinary GNU General Public License.  This license, the GNU Lesser
68
General Public License, applies to certain designated libraries, and
69
is quite different from the ordinary General Public License.  We use
70
this license for certain libraries in order to permit linking those
71
libraries into non-free programs.
72

  
73
  When a program is linked with a library, whether statically or using
74
a shared library, the combination of the two is legally speaking a
75
combined work, a derivative of the original library.  The ordinary
76
General Public License therefore permits such linking only if the
77
entire combination fits its criteria of freedom.  The Lesser General
78
Public License permits more lax criteria for linking other code with
79
the library.
80

  
81
  We call this license the "Lesser" General Public License because it
82
does Less to protect the user's freedom than the ordinary General
83
Public License.  It also provides other free software developers Less
84
of an advantage over competing non-free programs.  These disadvantages
85
are the reason we use the ordinary General Public License for many
86
libraries.  However, the Lesser license provides advantages in certain
87
special circumstances.
88

  
89
  For example, on rare occasions, there may be a special need to
90
encourage the widest possible use of a certain library, so that it becomes
91
a de-facto standard.  To achieve this, non-free programs must be
92
allowed to use the library.  A more frequent case is that a free
93
library does the same job as widely used non-free libraries.  In this
94
case, there is little to gain by limiting the free library to free
95
software only, so we use the Lesser General Public License.
96

  
97
  In other cases, permission to use a particular library in non-free
98
programs enables a greater number of people to use a large body of
99
free software.  For example, permission to use the GNU C Library in
100
non-free programs enables many more people to use the whole GNU
101
operating system, as well as its variant, the GNU/Linux operating
102
system.
103

  
104
  Although the Lesser General Public License is Less protective of the
105
users' freedom, it does ensure that the user of a program that is
106
linked with the Library has the freedom and the wherewithal to run
107
that program using a modified version of the Library.
108

  
109
  The precise terms and conditions for copying, distribution and
110
modification follow.  Pay close attention to the difference between a
111
"work based on the library" and a "work that uses the library".  The
112
former contains code derived from the library, whereas the latter must
113
be combined with the library in order to run.
114

  
115
		  GNU LESSER GENERAL PUBLIC LICENSE
116
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
117

  
118
  0. This License Agreement applies to any software library or other
119
program which contains a notice placed by the copyright holder or
120
other authorized party saying it may be distributed under the terms of
121
this Lesser General Public License (also called "this License").
122
Each licensee is addressed as "you".
123

  
124
  A "library" means a collection of software functions and/or data
125
prepared so as to be conveniently linked with application programs
126
(which use some of those functions and data) to form executables.
127

  
128
  The "Library", below, refers to any such software library or work
129
which has been distributed under these terms.  A "work based on the
130
Library" means either the Library or any derivative work under
131
copyright law: that is to say, a work containing the Library or a
132
portion of it, either verbatim or with modifications and/or translated
133
straightforwardly into another language.  (Hereinafter, translation is
134
included without limitation in the term "modification".)
135

  
136
  "Source code" for a work means the preferred form of the work for
137
making modifications to it.  For a library, complete source code means
138
all the source code for all modules it contains, plus any associated
139
interface definition files, plus the scripts used to control compilation
140
and installation of the library.
141

  
142
  Activities other than copying, distribution and modification are not
143
covered by this License; they are outside its scope.  The act of
144
running a program using the Library is not restricted, and output from
145
such a program is covered only if its contents constitute a work based
146
on the Library (independent of the use of the Library in a tool for
147
writing it).  Whether that is true depends on what the Library does
148
and what the program that uses the Library does.
149
  
150
  1. You may copy and distribute verbatim copies of the Library's
151
complete source code as you receive it, in any medium, provided that
152
you conspicuously and appropriately publish on each copy an
153
appropriate copyright notice and disclaimer of warranty; keep intact
154
all the notices that refer to this License and to the absence of any
155
warranty; and distribute a copy of this License along with the
156
Library.
157

  
158
  You may charge a fee for the physical act of transferring a copy,
159
and you may at your option offer warranty protection in exchange for a
160
fee.
161

  
162
  2. You may modify your copy or copies of the Library or any portion
163
of it, thus forming a work based on the Library, and copy and
164
distribute such modifications or work under the terms of Section 1
165
above, provided that you also meet all of these conditions:
166

  
167
    a) The modified work must itself be a software library.
168

  
169
    b) You must cause the files modified to carry prominent notices
170
    stating that you changed the files and the date of any change.
171

  
172
    c) You must cause the whole of the work to be licensed at no
173
    charge to all third parties under the terms of this License.
174

  
175
    d) If a facility in the modified Library refers to a function or a
176
    table of data to be supplied by an application program that uses
177
    the facility, other than as an argument passed when the facility
178
    is invoked, then you must make a good faith effort to ensure that,
179
    in the event an application does not supply such function or
180
    table, the facility still operates, and performs whatever part of
181
    its purpose remains meaningful.
182

  
183
    (For example, a function in a library to compute square roots has
184
    a purpose that is entirely well-defined independent of the
185
    application.  Therefore, Subsection 2d requires that any
186
    application-supplied function or table used by this function must
187
    be optional: if the application does not supply it, the square
188
    root function must still compute square roots.)
189

  
190
These requirements apply to the modified work as a whole.  If
191
identifiable sections of that work are not derived from the Library,
192
and can be reasonably considered independent and separate works in
193
themselves, then this License, and its terms, do not apply to those
194
sections when you distribute them as separate works.  But when you
195
distribute the same sections as part of a whole which is a work based
196
on the Library, the distribution of the whole must be on the terms of
197
this License, whose permissions for other licensees extend to the
198
entire whole, and thus to each and every part regardless of who wrote
199
it.
200

  
201
Thus, it is not the intent of this section to claim rights or contest
202
your rights to work written entirely by you; rather, the intent is to
203
exercise the right to control the distribution of derivative or
204
collective works based on the Library.
205

  
206
In addition, mere aggregation of another work not based on the Library
207
with the Library (or with a work based on the Library) on a volume of
208
a storage or distribution medium does not bring the other work under
209
the scope of this License.
210

  
211
  3. You may opt to apply the terms of the ordinary GNU General Public
212
License instead of this License to a given copy of the Library.  To do
213
this, you must alter all the notices that refer to this License, so
214
that they refer to the ordinary GNU General Public License, version 2,
215
instead of to this License.  (If a newer version than version 2 of the
216
ordinary GNU General Public License has appeared, then you can specify
217
that version instead if you wish.)  Do not make any other change in
218
these notices.
219

  
220
  Once this change is made in a given copy, it is irreversible for
221
that copy, so the ordinary GNU General Public License applies to all
222
subsequent copies and derivative works made from that copy.
223

  
224
  This option is useful when you wish to copy part of the code of
225
the Library into a program that is not a library.
226

  
227
  4. You may copy and distribute the Library (or a portion or
228
derivative of it, under Section 2) in object code or executable form
229
under the terms of Sections 1 and 2 above provided that you accompany
230
it with the complete corresponding machine-readable source code, which
231
must be distributed under the terms of Sections 1 and 2 above on a
232
medium customarily used for software interchange.
233

  
234
  If distribution of object code is made by offering access to copy
235
from a designated place, then offering equivalent access to copy the
236
source code from the same place satisfies the requirement to
237
distribute the source code, even though third parties are not
238
compelled to copy the source along with the object code.
239

  
240
  5. A program that contains no derivative of any portion of the
241
Library, but is designed to work with the Library by being compiled or
242
linked with it, is called a "work that uses the Library".  Such a
243
work, in isolation, is not a derivative work of the Library, and
244
therefore falls outside the scope of this License.
245

  
246
  However, linking a "work that uses the Library" with the Library
247
creates an executable that is a derivative of the Library (because it
248
contains portions of the Library), rather than a "work that uses the
249
library".  The executable is therefore covered by this License.
250
Section 6 states terms for distribution of such executables.
251

  
252
  When a "work that uses the Library" uses material from a header file
253
that is part of the Library, the object code for the work may be a
254
derivative work of the Library even though the source code is not.
255
Whether this is true is especially significant if the work can be
256
linked without the Library, or if the work is itself a library.  The
257
threshold for this to be true is not precisely defined by law.
258

  
259
  If such an object file uses only numerical parameters, data
260
structure layouts and accessors, and small macros and small inline
261
functions (ten lines or less in length), then the use of the object
262
file is unrestricted, regardless of whether it is legally a derivative
263
work.  (Executables containing this object code plus portions of the
264
Library will still fall under Section 6.)
265

  
266
  Otherwise, if the work is a derivative of the Library, you may
267
distribute the object code for the work under the terms of Section 6.
268
Any executables containing that work also fall under Section 6,
269
whether or not they are linked directly with the Library itself.
270

  
271
  6. As an exception to the Sections above, you may also combine or
272
link a "work that uses the Library" with the Library to produce a
273
work containing portions of the Library, and distribute that work
274
under terms of your choice, provided that the terms permit
275
modification of the work for the customer's own use and reverse
276
engineering for debugging such modifications.
277

  
278
  You must give prominent notice with each copy of the work that the
279
Library is used in it and that the Library and its use are covered by
280
this License.  You must supply a copy of this License.  If the work
281
during execution displays copyright notices, you must include the
282
copyright notice for the Library among them, as well as a reference
283
directing the user to the copy of this License.  Also, you must do one
284
of these things:
285

  
286
    a) Accompany the work with the complete corresponding
287
    machine-readable source code for the Library including whatever
288
    changes were used in the work (which must be distributed under
289
    Sections 1 and 2 above); and, if the work is an executable linked
290
    with the Library, with the complete machine-readable "work that
291
    uses the Library", as object code and/or source code, so that the
292
    user can modify the Library and then relink to produce a modified
293
    executable containing the modified Library.  (It is understood
294
    that the user who changes the contents of definitions files in the
295
    Library will not necessarily be able to recompile the application
296
    to use the modified definitions.)
297

  
298
    b) Use a suitable shared library mechanism for linking with the
299
    Library.  A suitable mechanism is one that (1) uses at run time a
300
    copy of the library already present on the user's computer system,
301
    rather than copying library functions into the executable, and (2)
302
    will operate properly with a modified version of the library, if
303
    the user installs one, as long as the modified version is
304
    interface-compatible with the version that the work was made with.
305

  
306
    c) Accompany the work with a written offer, valid for at
307
    least three years, to give the same user the materials
308
    specified in Subsection 6a, above, for a charge no more
309
    than the cost of performing this distribution.
310

  
311
    d) If distribution of the work is made by offering access to copy
312
    from a designated place, offer equivalent access to copy the above
313
    specified materials from the same place.
314

  
315
    e) Verify that the user has already received a copy of these
316
    materials or that you have already sent this user a copy.
317

  
318
  For an executable, the required form of the "work that uses the
319
Library" must include any data and utility programs needed for
320
reproducing the executable from it.  However, as a special exception,
321
the materials to be distributed need not include anything that is
322
normally distributed (in either source or binary form) with the major
323
components (compiler, kernel, and so on) of the operating system on
324
which the executable runs, unless that component itself accompanies
325
the executable.
326

  
327
  It may happen that this requirement contradicts the license
328
restrictions of other proprietary libraries that do not normally
329
accompany the operating system.  Such a contradiction means you cannot
330
use both them and the Library together in an executable that you
331
distribute.
332

  
333
  7. You may place library facilities that are a work based on the
334
Library side-by-side in a single library together with other library
335
facilities not covered by this License, and distribute such a combined
336
library, provided that the separate distribution of the work based on
337
the Library and of the other library facilities is otherwise
338
permitted, and provided that you do these two things:
339

  
340
    a) Accompany the combined library with a copy of the same work
341
    based on the Library, uncombined with any other library
342
    facilities.  This must be distributed under the terms of the
343
    Sections above.
344

  
345
    b) Give prominent notice with the combined library of the fact
346
    that part of it is a work based on the Library, and explaining
347
    where to find the accompanying uncombined form of the same work.
348

  
349
  8. You may not copy, modify, sublicense, link with, or distribute
350
the Library except as expressly provided under this License.  Any
351
attempt otherwise to copy, modify, sublicense, link with, or
352
distribute the Library is void, and will automatically terminate your
353
rights under this License.  However, parties who have received copies,
354
or rights, from you under this License will not have their licenses
355
terminated so long as such parties remain in full compliance.
356

  
357
  9. You are not required to accept this License, since you have not
358
signed it.  However, nothing else grants you permission to modify or
359
distribute the Library or its derivative works.  These actions are
360
prohibited by law if you do not accept this License.  Therefore, by
361
modifying or distributing the Library (or any work based on the
362
Library), you indicate your acceptance of this License to do so, and
363
all its terms and conditions for copying, distributing or modifying
364
the Library or works based on it.
365

  
366
  10. Each time you redistribute the Library (or any work based on the
367
Library), the recipient automatically receives a license from the
368
original licensor to copy, distribute, link with or modify the Library
369
subject to these terms and conditions.  You may not impose any further
370
restrictions on the recipients' exercise of the rights granted herein.
371
You are not responsible for enforcing compliance by third parties with
372
this License.
373

  
374
  11. If, as a consequence of a court judgment or allegation of patent
375
infringement or for any other reason (not limited to patent issues),
376
conditions are imposed on you (whether by court order, agreement or
377
otherwise) that contradict the conditions of this License, they do not
378
excuse you from the conditions of this License.  If you cannot
379
distribute so as to satisfy simultaneously your obligations under this
380
License and any other pertinent obligations, then as a consequence you
381
may not distribute the Library at all.  For example, if a patent
382
license would not permit royalty-free redistribution of the Library by
383
all those who receive copies directly or indirectly through you, then
384
the only way you could satisfy both it and this License would be to
385
refrain entirely from distribution of the Library.
386

  
387
If any portion of this section is held invalid or unenforceable under any
388
particular circumstance, the balance of the section is intended to apply,
389
and the section as a whole is intended to apply in other circumstances.
390

  
391
It is not the purpose of this section to induce you to infringe any
392
patents or other property right claims or to contest validity of any
393
such claims; this section has the sole purpose of protecting the
394
integrity of the free software distribution system which is
395
implemented by public license practices.  Many people have made
396
generous contributions to the wide range of software distributed
397
through that system in reliance on consistent application of that
398
system; it is up to the author/donor to decide if he or she is willing
399
to distribute software through any other system and a licensee cannot
400
impose that choice.
401

  
402
This section is intended to make thoroughly clear what is believed to
403
be a consequence of the rest of this License.
404

  
405
  12. If the distribution and/or use of the Library is restricted in
406
certain countries either by patents or by copyrighted interfaces, the
407
original copyright holder who places the Library under this License may add
408
an explicit geographical distribution limitation excluding those countries,
409
so that distribution is permitted only in or among countries not thus
410
excluded.  In such case, this License incorporates the limitation as if
411
written in the body of this License.
412

  
413
  13. The Free Software Foundation may publish revised and/or new
414
versions of the Lesser General Public License from time to time.
415
Such new versions will be similar in spirit to the present version,
416
but may differ in detail to address new problems or concerns.
417

  
418
Each version is given a distinguishing version number.  If the Library
419
specifies a version number of this License which applies to it and
420
"any later version", you have the option of following the terms and
421
conditions either of that version or of any later version published by
422
the Free Software Foundation.  If the Library does not specify a
423
license version number, you may choose any version ever published by
424
the Free Software Foundation.
425

  
426
  14. If you wish to incorporate parts of the Library into other free
427
programs whose distribution conditions are incompatible with these,
428
write to the author to ask for permission.  For software which is
429
copyrighted by the Free Software Foundation, write to the Free
430
Software Foundation; we sometimes make exceptions for this.  Our
431
decision will be guided by the two goals of preserving the free status
432
of all derivatives of our free software and of promoting the sharing
433
and reuse of software generally.
434

  
435
			    NO WARRANTY
436

  
437
  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
438
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
439
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
440
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
441
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
442
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
443
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
444
LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
445
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
446

  
447
  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
448
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
449
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
450
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
451
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
452
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
453
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
454
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
455
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
456
DAMAGES.
457

  
458
		     END OF TERMS AND CONDITIONS
459

  
460
           How to Apply These Terms to Your New Libraries
461

  
462
  If you develop a new library, and you want it to be of the greatest
463
possible use to the public, we recommend making it free software that
464
everyone can redistribute and change.  You can do so by permitting
465
redistribution under these terms (or, alternatively, under the terms of the
466
ordinary General Public License).
467

  
468
  To apply these terms, attach the following notices to the library.  It is
469
safest to attach them to the start of each source file to most effectively
470
convey the exclusion of warranty; and each file should have at least the
471
"copyright" line and a pointer to where the full notice is found.
472

  
473
    <one line to give the library's name and a brief idea of what it does.>
474
    Copyright (C) <year>  <name of author>
475

  
476
    This library is free software; you can redistribute it and/or
477
    modify it under the terms of the GNU Lesser General Public
478
    License as published by the Free Software Foundation; either
479
    version 2.1 of the License, or (at your option) any later version.
480

  
481
    This library is distributed in the hope that it will be useful,
482
    but WITHOUT ANY WARRANTY; without even the implied warranty of
483
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
484
    Lesser General Public License for more details.
485

  
486
    You should have received a copy of the GNU Lesser General Public
487
    License along with this library; if not, write to the Free Software
488
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
489

  
490
Also add information on how to contact you by electronic and paper mail.
491

  
492
You should also get your employer (if you work as a programmer) or your
493
school, if any, to sign a "copyright disclaimer" for the library, if
494
necessary.  Here is a sample; alter the names:
495

  
496
  Yoyodyne, Inc., hereby disclaims all copyright interest in the
497
  library `Frob' (a library for tweaking knobs) written by James Random Hacker.
498

  
499
  <signature of Ty Coon>, 1 April 1990
500
  Ty Coon, President of Vice
501

  
502
That's all there is to it!
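
The class added in the next file implements the Punycode encoding of RFC 3492 with the parameters base 36, tmin 1, tmax 26, skew 38 and damp 700. As a rough illustration of two of its building blocks, the sketch below restates the digit mapping and the bias adaptation as standalone functions. The names puny_encode_digit(), puny_decode_digit() and puny_adapt() and the PUNY_* constants are hypothetical and are not part of the library; the sketch assumes PHP 7 or later (intdiv() and scalar type declarations), and it is not the library's own code.

```php
<?php
// Sketch only: hypothetical standalone helpers, not the idna_convert API.
const PUNY_BASE = 36;
const PUNY_TMIN = 1;
const PUNY_TMAX = 26;
const PUNY_SKEW = 38;
const PUNY_DAMP = 700;

// Map a digit value 0..35 to ASCII: 0..25 => 'a'..'z', 26..35 => '0'..'9'.
function puny_encode_digit(int $d): string
{
    return chr($d < 26 ? $d + 97 : $d - 26 + 48);
}

// Inverse mapping; returns PUNY_BASE for characters that are not valid digits.
function puny_decode_digit(string $ch): int
{
    $cp = ord($ch);
    if ($cp >= 48 && $cp <= 57) return $cp - 22;   // '0'..'9' => 26..35
    if ($cp >= 65 && $cp <= 90) return $cp - 65;   // 'A'..'Z' => 0..25
    if ($cp >= 97 && $cp <= 122) return $cp - 97;  // 'a'..'z' => 0..25
    return PUNY_BASE;
}

// Bias adaptation as described in RFC 3492, section 6.1.
function puny_adapt(int $delta, int $npoints, bool $is_first): int
{
    $delta = $is_first ? intdiv($delta, PUNY_DAMP) : intdiv($delta, 2);
    $delta += intdiv($delta, $npoints);
    $k = 0;
    while ($delta > intdiv((PUNY_BASE - PUNY_TMIN) * PUNY_TMAX, 2)) {
        $delta = intdiv($delta, PUNY_BASE - PUNY_TMIN);
        $k += PUNY_BASE;
    }
    return $k + intdiv((PUNY_BASE - PUNY_TMIN + 1) * $delta, $delta + PUNY_SKEW);
}
```

Unlike a mapping that only checks an upper bound, this version checks both ends of each character range, so a character such as '-' is reported as invalid (PUNY_BASE) instead of being mapped to a digit value.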
branches/2.8.x/wb/include/idna_convert/idna_convert.class.php
1
<?php
2
// {{{ license
3

  
4
/* vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4 foldmethod=marker: */
5
//
6
// +----------------------------------------------------------------------+
7
// | This library is free software; you can redistribute it and/or modify |
8
// | it under the terms of the GNU Lesser General Public License as       |
9
// | published by the Free Software Foundation; either version 2.1 of the |
10
// | License, or (at your option) any later version.                      |
11
// |                                                                      |
12
// | This library is distributed in the hope that it will be useful, but  |
13
// | WITHOUT ANY WARRANTY; without even the implied warranty of           |
14
// | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU    |
15
// | Lesser General Public License for more details.                      |
16
// |                                                                      |
17
// | You should have received a copy of the GNU Lesser General Public     |
18
// | License along with this library; if not, write to the Free Software  |
19
// | Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 |
20
// | USA.                                                                 |
21
// +----------------------------------------------------------------------+
22
//
23

  
24
// }}}
25

  
26
/**
27
 * Encode/decode Internationalized Domain Names.
28
 *
29
 * The class allows internationalized domain names
30
 * (see RFC 3490 for details) as they can be used with various registries worldwide
31
 * to be translated between their original (localized) form and their encoded form
32
 * as it will be used in the DNS (Domain Name System).
33
 *
34
 * The class provides two public methods, encode() and decode(), which do exactly
35
 * what you would expect them to do. You are allowed to use complete domain names,
36
 * simple strings and complete email addresses as well. That means that you might
37
 * use any of the following notations:
38
 *
39
 * - www.nörgler.com
40
 * - xn--nrgler-wxa
41
 * - xn--brse-5qa.xn--knrz-1ra.info
42
 *
43
 * Unicode input might be given as either UTF-8 string, UCS-4 string or UCS-4 array.
44
 * Unicode output is available in the same formats.
45
 * You can select your preferred format via {@link set_parameter()}.
46
 *
47
 * ACE input and output is always expected to be ASCII.
48
 *
49
 * @author  Matthias Sommerfeld <mso@phlylabs.de>
50
 * @author  Leonid Kogan <lko@neuse.de>
51
 * @copyright 2004-2010 phlyLabs Berlin, http://phlylabs.de
52
 * @version 0.6.9 2010-11-04
53
 */
54
class idna_convert
55
{
56
    // NP See below
57

  
58
    // Internal settings, do not mess with them
59
    protected $_punycode_prefix = 'xn--';
60
    protected $_invalid_ucs = 0x80000000;
61
    protected $_max_ucs = 0x10FFFF;
62
    protected $_base = 36;
63
    protected $_tmin = 1;
64
    protected $_tmax = 26;
65
    protected $_skew = 38;
66
    protected $_damp = 700;
67
    protected $_initial_bias = 72;
68
    protected $_initial_n = 0x80;
69
    protected $_sbase = 0xAC00;
70
    protected $_lbase = 0x1100;
71
    protected $_vbase = 0x1161;
72
    protected $_tbase = 0x11A7;
73
    protected $_lcount = 19;
74
    protected $_vcount = 21;
75
    protected $_tcount = 28;
76
    protected $_ncount = 588;   // _vcount * _tcount
77
    protected $_scount = 11172; // _lcount * _tcount * _vcount
78
    protected $_error = false;
79

  
80
    // See {@link set_parameter()} for details of how to change the following
81
    // settings from within your script / application
82
    protected $_api_encoding = 'utf8';   // Default input charset is UTF-8
83
    protected $_allow_overlong = false;  // Overlong UTF-8 encodings are forbidden
84
    protected $_strict_mode = false;     // Behave strict or not
85
    protected $_encode_german_sz = true; // True to encode German ß; false if not
86

  
87
    /**
88
     * the constructor
89
     *
90
     * @param array $options
91
     * @return boolean
92
     * @since 0.5.2
93
     */
94
    public function __construct($options = false)
95
    {
96
        $this->slast = $this->_sbase + $this->_lcount * $this->_vcount * $this->_tcount;
97
        // If parameters are given, pass these to the respective method
98
        if (is_array($options)) return $this->set_parameter($options);
99
        if (!$this->_encode_german_sz) {
100
            $this->NP['replacemaps'][0xDF] = array(0x73, 0x73);
101
        }
102
    }
103

  
104
    /**
105
     * Sets a new option value. Available options and values:
106
     * [encoding - Use either UTF-8, UCS4 as array or UCS4 as string as input ('utf8' for UTF-8,
107
     *         'ucs4_string' and 'ucs4_array' respectively for UCS4); The output is always UTF-8]
108
     * [overlong - Unicode does not allow unnecessarily long encodings of chars,
109
     *             to allow this, set this parameter to true, else to false;
110
     *             default is false.]
111
     * [strict - true: strict mode, good for registration purposes - Causes errors
112
     *           on failures; false: loose mode, ideal for "wildlife" applications
113
     *           by silently ignoring errors and returning the original input instead]
114
     *
115
     * @param    mixed     Parameter to set (string: single parameter; array of Parameter => Value pairs)
116
     * @param    string    Value to use (if parameter 1 is a string)
117
     * @return   boolean   true on success, false otherwise
118
     */
119
    public function set_parameter($option, $value = false)
120
    {
121
        if (!is_array($option)) {
122
            $option = array($option => $value);
123
        }
124
        foreach ($option as $k => $v) {
125
            switch ($k) {
126
            case 'encoding':
127
                switch ($v) {
128
                case 'utf8':
129
                case 'ucs4_string':
130
                case 'ucs4_array':
131
                    $this->_api_encoding = $v;
132
                    break;
133
                default:
134
                    $this->_error('Set Parameter: Unknown parameter '.$v.' for option '.$k);
135
                    return false;
136
                }
137
                break;
138
            case 'overlong':
139
                $this->_allow_overlong = ($v) ? true : false;
140
                break;
141
            case 'strict':
142
                $this->_strict_mode = ($v) ? true : false;
143
                break;
144
            case 'encode_german_sz':
145
                $this->_encode_german_sz = ($v) ? true : false;
146
                break;
147
            default:
148
                $this->_error('Set Parameter: Unknown option '.$k);
149
                return false;
150
            }
151
        }
152
        return true;
153
    }
154

  
155
    /**
156
     * Decode a given ACE domain name
157
     * @param    string   Domain name (ACE string)
158
     * [@param    string   Desired output encoding, see {@link set_parameter}]
159
     * @return   string   Decoded Domain name (UTF-8 or UCS-4)
160
     */
161
    public function decode($input, $one_time_encoding = false)
162
    {
163
        // Optionally set
164
        if ($one_time_encoding) {
165
            switch ($one_time_encoding) {
166
            case 'utf8':
167
            case 'ucs4_string':
168
            case 'ucs4_array':
169
                break;
170
            default:
171
                $this->_error('Unknown encoding '.$one_time_encoding);
172
                return false;
173
            }
174
        }
175
        // Make sure to drop any newline characters around
176
        $input = trim($input);
177

  
178
        // Negotiate input and try to determine, whether it is a plain string,
179
        // an email address or something like a complete URL
180
        if (strpos($input, '@')) { // Maybe it is an email address
181
            // No no in strict mode
182
            if ($this->_strict_mode) {
183
                $this->_error('Only simple domain name parts can be handled in strict mode');
184
                return false;
185
            }
186
            list ($email_pref, $input) = explode('@', $input, 2);
187
            $arr = explode('.', $input);
188
            foreach ($arr as $k => $v) {
189
                if (preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $v)) {
190
                    $conv = $this->_decode($v);
191
                    if ($conv) $arr[$k] = $conv;
192
                }
193
            }
194
            $input = join('.', $arr);
195
            $arr = explode('.', $email_pref);
196
            foreach ($arr as $k => $v) {
197
                if (preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $v)) {
198
                    $conv = $this->_decode($v);
199
                    if ($conv) $arr[$k] = $conv;
200
                }
201
            }
202
            $email_pref = join('.', $arr);
203
            $return = $email_pref . '@' . $input;
204
        } elseif (preg_match('![:\./]!', $input)) { // Or a complete domain name (with or without paths / parameters)
205
            // No no in strict mode
206
            if ($this->_strict_mode) {
207
                $this->_error('Only simple domain name parts can be handled in strict mode');
208
                return false;
209
            }
210
            $parsed = parse_url($input);
211
            if (isset($parsed['host'])) {
212
                $arr = explode('.', $parsed['host']);
213
                foreach ($arr as $k => $v) {
214
                    $conv = $this->_decode($v);
215
                    if ($conv) $arr[$k] = $conv;
216
                }
217
                $parsed['host'] = join('.', $arr);
218
                $return =
219
                        (empty($parsed['scheme']) ? '' : $parsed['scheme'].(strtolower($parsed['scheme']) == 'mailto' ? ':' : '://'))
220
                        .(empty($parsed['user']) ? '' : $parsed['user'].(empty($parsed['pass']) ? '' : ':'.$parsed['pass']).'@')
221
                        .$parsed['host']
222
                        .(empty($parsed['port']) ? '' : ':'.$parsed['port'])
223
                        .(empty($parsed['path']) ? '' : $parsed['path'])
224
                        .(empty($parsed['query']) ? '' : '?'.$parsed['query'])
225
                        .(empty($parsed['fragment']) ? '' : '#'.$parsed['fragment']);
226
            } else { // parse_url seems to have failed, try without it
227
                $arr = explode('.', $input);
228
                foreach ($arr as $k => $v) {
229
                    $conv = $this->_decode($v);
230
                    $arr[$k] = ($conv) ? $conv : $v;
231
                }
232
                $return = join('.', $arr);
233
            }
234
        } else { // Otherwise we consider it being a pure domain name string
235
            $return = $this->_decode($input);
236
            if (!$return) $return = $input;
237
        }
238
        // The output is UTF-8 by default, other output formats need conversion here
239
        // If one time encoding is given, use this, else the objects property
240
        switch (($one_time_encoding) ? $one_time_encoding : $this->_api_encoding) {
241
        case 'utf8':
242
            return $return;
243
            break;
244
        case 'ucs4_string':
245
           return $this->_ucs4_to_ucs4_string($this->_utf8_to_ucs4($return));
246
           break;
247
        case 'ucs4_array':
248
            return $this->_utf8_to_ucs4($return);
249
            break;
250
        default:
251
            $this->_error('Unsupported output format');
252
            return false;
253
        }
254
    }
255

  
256
    /**
257
     * Encode a given UTF-8 domain name
258
     * @param    string   Domain name (UTF-8 or UCS-4)
259
     * [@param    string   Desired input encoding, see {@link set_parameter}]
260
     * @return   string   Encoded Domain name (ACE string)
261
     */
262
    public function encode($decoded, $one_time_encoding = false)
263
    {
264
        // Forcing conversion of input to UCS4 array
265
        // If one time encoding is given, use this, else the objects property
266
        switch ($one_time_encoding ? $one_time_encoding : $this->_api_encoding) {
267
        case 'utf8':
268
            $decoded = $this->_utf8_to_ucs4($decoded);
269
            break;
270
        case 'ucs4_string':
271
           $decoded = $this->_ucs4_string_to_ucs4($decoded);
272
        case 'ucs4_array':
273
           break;
274
        default:
275
            $this->_error('Unsupported input format: '.($one_time_encoding ? $one_time_encoding : $this->_api_encoding));
276
            return false;
277
        }
278

  
279
        // No input, no output, what else did you expect?
280
        if (empty($decoded)) return '';
281

  
282
        // Anchors for iteration
283
        $last_begin = 0;
284
        // Output string
285
        $output = '';
286
        foreach ($decoded as $k => $v) {
287
            // Make sure to use just the plain dot
288
            switch($v) {
289
            case 0x3002:
290
            case 0xFF0E:
291
            case 0xFF61:
292
                $decoded[$k] = 0x2E;
293
                // Right, no break here, the above are converted to dots anyway
294
            // Stumbling across an anchoring character
295
            case 0x2E:
296
            case 0x2F:
297
            case 0x3A:
298
            case 0x3F:
299
            case 0x40:
300
                // Neither email addresses nor URLs allowed in strict mode
301
                if ($this->_strict_mode) {
302
                   $this->_error('Neither email addresses nor URLs are allowed in strict mode.');
303
                   return false;
304
                } else {
305
                    // Skip first char
306
                    if ($k) {
307
                        $encoded = '';
308
                        $encoded = $this->_encode(array_slice($decoded, $last_begin, (($k)-$last_begin)));
309
                        if ($encoded) {
310
                            $output .= $encoded;
311
                        } else {
312
                            $output .= $this->_ucs4_to_utf8(array_slice($decoded, $last_begin, (($k)-$last_begin)));
313
                        }
314
                        $output .= chr($decoded[$k]);
315
                    }
316
                    $last_begin = $k + 1;
317
                }
318
            }
319
        }
320
        // Catch the rest of the string
321
        if ($last_begin) {
322
            $inp_len = sizeof($decoded);
323
            $encoded = '';
324
            $encoded = $this->_encode(array_slice($decoded, $last_begin, (($inp_len)-$last_begin)));
325
            if ($encoded) {
326
                $output .= $encoded;
327
            } else {
328
                $output .= $this->_ucs4_to_utf8(array_slice($decoded, $last_begin, (($inp_len)-$last_begin)));
329
            }
330
            return $output;
331
        } else {
332
            if ($output = $this->_encode($decoded)) {
333
                return $output;
334
            } else {
335
                return $this->_ucs4_to_utf8($decoded);
336
            }
337
        }
338
    }
339

  
340
    /**
341
     * Removes a weakness of encode(), which cannot properly handle URIs but instead encodes their
342
     * path or query components, too.
343
     * @param string  $uri  Expects the URI as a UTF-8 (or ASCII) string
344
     * @return  string  The URI encoded to Punycode, everything but the host component is left alone
345
     * @since 0.6.4
346
     */
347
    public function encode_uri($uri)
348
    {
349
        $parsed = parse_url($uri);
350
        if (!isset($parsed['host'])) {
351
            $this->_error('The given string does not look like a URI');
352
            return false;
353
        }
354
        $arr = explode('.', $parsed['host']);
355
        foreach ($arr as $k => $v) {
356
            $conv = $this->encode($v, 'utf8');
357
            if ($conv) $arr[$k] = $conv;
358
        }
359
        $parsed['host'] = join('.', $arr);
360
        $return =
361
                (empty($parsed['scheme']) ? '' : $parsed['scheme'].(strtolower($parsed['scheme']) == 'mailto' ? ':' : '://'))
362
                .(empty($parsed['user']) ? '' : $parsed['user'].(empty($parsed['pass']) ? '' : ':'.$parsed['pass']).'@')
363
                .$parsed['host']
364
                .(empty($parsed['port']) ? '' : ':'.$parsed['port'])
365
                .(empty($parsed['path']) ? '' : $parsed['path'])
366
                .(empty($parsed['query']) ? '' : '?'.$parsed['query'])
367
                .(empty($parsed['fragment']) ? '' : '#'.$parsed['fragment']);
368
        return $return;
369
    }
370

  
371
    /**
372
     * Use this method to get the last error that occurred
373
     * @param    void
374
     * @return   string   The last error that occurred
375
     */
376
    public function get_last_error()
377
    {
378
        return $this->_error;
379
    }
380

  
381
    /**
382
     * The actual decoding algorithm
383
     * @param string
384
     * @return mixed
385
     */
386
    protected function _decode($encoded)
387
    {
388
        $decoded = array();
389
        // find the Punycode prefix
390
        if (!preg_match('!^'.preg_quote($this->_punycode_prefix, '!').'!', $encoded)) {
391
            $this->_error('This is not a punycode string');
392
            return false;
393
        }
394
        $encode_test = preg_replace('!^'.preg_quote($this->_punycode_prefix, '!').'!', '', $encoded);
395
        // If nothing left after removing the prefix, it is hopeless
396
        if (!$encode_test) {
397
            $this->_error('The given encoded string was empty');
398
            return false;
399
        }
400
        // Find last occurrence of the delimiter
401
        $delim_pos = strrpos($encoded, '-');
402
        if ($delim_pos > strlen($this->_punycode_prefix)) {
403
            for ($k = strlen($this->_punycode_prefix); $k < $delim_pos; ++$k) {
404
                $decoded[] = ord($encoded[$k]);
405
            }
406
        }
407
        $deco_len = count($decoded);
408
        $enco_len = strlen($encoded);
409

  
410
        // Wandering through the strings; init
411
        $is_first = true;
412
        $bias = $this->_initial_bias;
413
        $idx = 0;
414
        $char = $this->_initial_n;
415

  
416
        for ($enco_idx = ($delim_pos) ? ($delim_pos + 1) : 0; $enco_idx < $enco_len; ++$deco_len) {
417
            for ($old_idx = $idx, $w = 1, $k = $this->_base; 1 ; $k += $this->_base) {
418
                $digit = $this->_decode_digit($encoded[$enco_idx++]);
419
                $idx += $digit * $w;
420
                $t = ($k <= $bias) ? $this->_tmin :
421
                        (($k >= $bias + $this->_tmax) ? $this->_tmax : ($k - $bias));
422
                if ($digit < $t) break;
423
                $w = (int) ($w * ($this->_base - $t));
424
            }
425
            $bias = $this->_adapt($idx - $old_idx, $deco_len + 1, $is_first);
426
            $is_first = false;
427
            $char += (int) ($idx / ($deco_len + 1));
428
            $idx %= ($deco_len + 1);
429
            if ($deco_len > 0) {
430
                // Make room for the decoded char
431
                for ($i = $deco_len; $i > $idx; $i--) $decoded[$i] = $decoded[($i - 1)];
432
            }
433
            $decoded[$idx++] = $char;
434
        }
435
        return $this->_ucs4_to_utf8($decoded);
436
    }
437

  
438
    /**
439
     * The actual encoding algorithm
440
     * @param  string
441
     * @return mixed
442
     */
443
    protected function _encode($decoded)
444
    {
445
        // We cannot encode a domain name containing the Punycode prefix
446
        $extract = strlen($this->_punycode_prefix);
447
        $check_pref = $this->_utf8_to_ucs4($this->_punycode_prefix);
448
        $check_deco = array_slice($decoded, 0, $extract);
449

  
450
        if ($check_pref == $check_deco) {
451
            $this->_error('This is already a punycode string');
452
            return false;
453
        }
454
        // We will not try to encode strings consisting of basic code points only
455
        $encodable = false;
456
        foreach ($decoded as $k => $v) {
457
            if ($v > 0x7a) {
458
                $encodable = true;
459
                break;
460
            }
461
        }
462
        if (!$encodable) {
463
            $this->_error('The given string does not contain encodable chars');
464
            return false;
465
        }
466
        // Do NAMEPREP
467
        $decoded = $this->_nameprep($decoded);
468
        if (!$decoded || !is_array($decoded)) return false; // NAMEPREP failed
469
        $deco_len  = count($decoded);
470
        if (!$deco_len) return false; // Empty array
471
        $codecount = 0; // How many chars have been consumed
472
        $encoded = '';
473
        // Copy all basic code points to output
474
        for ($i = 0; $i < $deco_len; ++$i) {
475
            $test = $decoded[$i];
476
            // Intended to match [-0-9a-zA-Z]; as written, the ranges also admit 0x3A-0x3F (:;<=>?) and 0x7B ({)
477
            if ((0x2F < $test && $test < 0x40) || (0x40 < $test && $test < 0x5B)
478
                    || (0x60 < $test && $test <= 0x7B) || (0x2D == $test)) {
479
                $encoded .= chr($decoded[$i]);
480
                $codecount++;
481
            }
482
        }
483
        if ($codecount == $deco_len) return $encoded; // All codepoints were basic ones
484

  
485
        // Start with the prefix; copy it to output
486
        $encoded = $this->_punycode_prefix.$encoded;
487
        // If we have basic code points in output, add an hyphen to the end
488
        if ($codecount) $encoded .= '-';
489
        // Now find and encode all non-basic code points
490
        $is_first = true;
491
        $cur_code = $this->_initial_n;
492
        $bias = $this->_initial_bias;
493
        $delta = 0;
494
        while ($codecount < $deco_len) {
495
            // Find the smallest code point >= the current code point and
496
            // remember the last occurrence of it in the input
497
            for ($i = 0, $next_code = $this->_max_ucs; $i < $deco_len; $i++) {
498
                if ($decoded[$i] >= $cur_code && $decoded[$i] <= $next_code) {
499
                    $next_code = $decoded[$i];
500
                }
501
            }
502
            $delta += ($next_code - $cur_code) * ($codecount + 1);
503
            $cur_code = $next_code;
504

  
505
            // Scan input again and encode all characters whose code point is $cur_code
506
            for ($i = 0; $i < $deco_len; $i++) {
507
                if ($decoded[$i] < $cur_code) {
508
                    $delta++;
509
                } elseif ($decoded[$i] == $cur_code) {
510
                    for ($q = $delta, $k = $this->_base; 1; $k += $this->_base) {
511
                        $t = ($k <= $bias) ? $this->_tmin :
512
                                (($k >= $bias + $this->_tmax) ? $this->_tmax : $k - $bias);
513
                        if ($q < $t) break;
514
                        $encoded .= $this->_encode_digit(intval($t + (($q - $t) % ($this->_base - $t)))); //v0.4.5 Changed from ceil() to intval()
515
                        $q = (int) (($q - $t) / ($this->_base - $t));
516
                    }
517
                    $encoded .= $this->_encode_digit($q);
518
                    $bias = $this->_adapt($delta, $codecount+1, $is_first);
519
                    $codecount++;
520
                    $delta = 0;
521
                    $is_first = false;
522
                }
523
            }
524
            $delta++;
525
            $cur_code++;
526
        }
527
        return $encoded;
528
    }
529

  
530
    /**
531
     * Adapt the bias according to the current code point and position
532
     * @param int $delta
533
     * @param int $npoints
534
     * @param int $is_first
535
     * @return int
536
     */
537
    protected function _adapt($delta, $npoints, $is_first)
538
    {
539
        $delta = intval($is_first ? ($delta / $this->_damp) : ($delta / 2));
540
        $delta += intval($delta / $npoints);
541
        for ($k = 0; $delta > (($this->_base - $this->_tmin) * $this->_tmax) / 2; $k += $this->_base) {
542
            $delta = intval($delta / ($this->_base - $this->_tmin));
543
        }
544
        return intval($k + ($this->_base - $this->_tmin + 1) * $delta / ($delta + $this->_skew));
545
    }
546

  
547
    /**
548
     * Encoding a certain digit
549
     * @param    int $d
550
     * @return string
551
     */
552
    protected function _encode_digit($d)
553
    {
554
        return chr($d + 22 + 75 * ($d < 26));
555
    }
556

  
557
    /**
558
     * Decode a certain digit
559
     * @param    int $cp
560
     * @return int
561
     */
562
    protected function _decode_digit($cp)
563
    {
564
        $cp = ord($cp);
565
        return ($cp - 48 < 10) ? $cp - 22 : (($cp - 65 < 26) ? $cp - 65 : (($cp - 97 < 26) ? $cp - 97 : $this->_base));
566
    }
567

  
568
    /**
569
     * Internal error handling method
570
     * @param  string $error
571
     */
572
    protected function _error($error = '')
573
    {
574
        $this->_error = $error;
575
    }
576

  
577
    /**
578
     * Do Nameprep according to RFC3491 and RFC3454
579
     * @param    array    Unicode Characters
580
     * @return   array    Unicode code points, Nameprep'd (false on prohibited input)
581
     */
582
    protected function _nameprep($input)
583
    {
584
        $output = array();
585
        $error = false;
586
        //
587
        // Mapping
588
        // Walking through the input array, performing the required steps on each of
589
        // the input chars and putting the result into the output array
590
        // While mapping required chars we apply the canonical ordering
591
        foreach ($input as $v) {
592
            // Map to nothing == skip that code point
593
            if (in_array($v, $this->NP['map_nothing'])) continue;
594
            // Try to find prohibited input
595
            if (in_array($v, $this->NP['prohibit']) || in_array($v, $this->NP['general_prohibited'])) {
596
                $this->_error('NAMEPREP: Prohibited input U+'.sprintf('%08X', $v));
597
                return false;
598
            }
599
            foreach ($this->NP['prohibit_ranges'] as $range) {
600
                if ($range[0] <= $v && $v <= $range[1]) {
601
                    $this->_error('NAMEPREP: Prohibited input U+'.sprintf('%08X', $v));
602
                    return false;
603
                }
604
            }
605
            // Hangul syllable decomposition
606
            if (0xAC00 <= $v && $v <= 0xD7AF) {
607
                foreach ($this->_hangul_decompose($v) as $out) $output[] = (int) $out;
            // There's a decomposition mapping for that code point
            } elseif (isset($this->NP['replacemaps'][$v])) {
                foreach ($this->_apply_cannonical_ordering($this->NP['replacemaps'][$v]) as $out) {
                    $output[] = (int) $out;
                }
            } else {
                $output[] = (int) $v;
            }
        }
        // Before applying any Combining, try to rearrange any Hangul syllables
        $output = $this->_hangul_compose($output);
        //
        // Combine code points
        //
        $last_class = 0;
        $last_starter = 0;
        $out_len = count($output);
        for ($i = 0; $i < $out_len; ++$i) {
            $class = $this->_get_combining_class($output[$i]);
            if ((!$last_class || $last_class > $class) && $class) {
                // Try to match
                $seq_len = $i - $last_starter;
                $out = $this->_combine(array_slice($output, $last_starter, $seq_len));
                // On match: Replace the last starter with the composed character and remove
                // the now redundant non-starter(s)
                if ($out) {
                    $output[$last_starter] = $out;
                    if (count($out) != $seq_len) {
                        for ($j = $i+1; $j < $out_len; ++$j) $output[$j-1] = $output[$j];
                        unset($output[$out_len]);
                    }
                    // Rewind the for loop by one, since there can be more possible compositions
                    $i--;
                    $out_len--;
                    $last_class = ($i == $last_starter) ? 0 : $this->_get_combining_class($output[$i-1]);
                    continue;
                }
            }
            // The current class is 0
            if (!$class) $last_starter = $i;
            $last_class = $class;
        }
        return $output;
    }

    /**
     * Decomposes a Hangul syllable
     * (see http://www.unicode.org/unicode/reports/tr15/#Hangul)
     * @param    integer  32bit UCS4 code point
     * @return   array    Either Hangul Syllable decomposed or original 32bit value as one value array
     */
    protected function _hangul_decompose($char)
    {
        $sindex = (int) $char - $this->_sbase;
        if ($sindex < 0 || $sindex >= $this->_scount) return array($char);
        $result = array();
        // Cast the whole expression: a bare (int) binds only to the first operand,
        // which would leave a float result from the division
        $result[] = (int) ($this->_lbase + $sindex / $this->_ncount);
        $result[] = (int) ($this->_vbase + ($sindex % $this->_ncount) / $this->_tcount);
        $T = intval($this->_tbase + $sindex % $this->_tcount);
        if ($T != $this->_tbase) $result[] = $T;
        return $result;
    }
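The arithmetic in `_hangul_decompose()` is easier to follow with concrete numbers. The following is a minimal standalone sketch, not part of the class: the constants are the ones published with the Hangul algorithm in UAX #15 and are assumed to match the `_sbase`, `_lbase`, `_vbase`, `_tbase`, `_ncount`, `_tcount` and `_scount` properties, which are declared outside this excerpt. The function name is hypothetical, and `intdiv()` requires PHP 7 or later.

```php
<?php
// Constants of the Hangul algorithm (UAX #15); assumed to equal the class properties.
const S_BASE  = 0xAC00; // first precomposed syllable
const L_BASE  = 0x1100; // first leading consonant
const V_BASE  = 0x1161; // first vowel
const T_BASE  = 0x11A7; // one below the first trailing consonant
const T_COUNT = 28;     // trailing consonants, including "none"
const N_COUNT = 588;    // 21 vowels * 28 trailing slots
const S_COUNT = 11172;  // 19 leading consonants * N_COUNT

// Hypothetical helper: split one code point into L, V and optional T jamo.
function hangul_decompose_sketch(int $char): array
{
    $sindex = $char - S_BASE;
    if ($sindex < 0 || $sindex >= S_COUNT) {
        return [$char]; // not a precomposed Hangul syllable
    }
    $result = [
        L_BASE + intdiv($sindex, N_COUNT),
        V_BASE + intdiv($sindex % N_COUNT, T_COUNT),
    ];
    $t = T_BASE + $sindex % T_COUNT;
    if ($t !== T_BASE) {
        $result[] = $t; // only emit T when a trailing consonant exists
    }
    return $result;
}
```

For U+D55C the syllable index is 0xD55C - 0xAC00 = 10588, which gives L = 0x1100 + 18, V = 0x1161 + 0 and T = 0x11A7 + 4, so `hangul_decompose_sketch(0xD55C)` returns `[0x1112, 0x1161, 0x11AB]`.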
    /**
     * Composes a Hangul syllable
     * (see http://www.unicode.org/unicode/reports/tr15/#Hangul)
     * @param    array    Decomposed UCS4 sequence
     * @return   array    UCS4 sequence with syllables composed
     */
    protected function _hangul_compose($input)
    {
        $inp_len = count($input);
        if (!$inp_len) return array();
        $result = array();
        $last = (int) $input[0];
        $result[] = $last; // copy first char from input to output

        for ($i = 1; $i < $inp_len; ++$i) {
            $char = (int) $input[$i];
            $sindex = $last - $this->_sbase;
            $lindex = $last - $this->_lbase;
            $vindex = $char - $this->_vbase;
            $tindex = $char - $this->_tbase;
            // Find out, whether two current characters are LV and T
            // (T index 0 means "no trailing consonant", so valid values are 1 .. TCount-1)
            if (0 <= $sindex && $sindex < $this->_scount && ($sindex % $this->_tcount == 0)
                    && 0 < $tindex && $tindex < $this->_tcount) {
                // create syllable of form LVT
                $last += $tindex;
                $result[(count($result) - 1)] = $last; // reset last
                continue; // discard char
            }
            // Find out, whether two current characters form L and V
            if (0 <= $lindex && $lindex < $this->_lcount && 0 <= $vindex && $vindex < $this->_vcount) {
                // create syllable of form LV
                $last = (int) $this->_sbase + ($lindex * $this->_vcount + $vindex) * $this->_tcount;
                $result[(count($result) - 1)] = $last; // reset last
                continue; // discard char
            }
            // if neither case was true, just add the character
            $last = $char;
            $result[] = $char;
        }
        return $result;
    }

    /**
     * Returns the combining class of a certain wide char
     * @param    integer    Wide char to check (32bit integer)
     * @return   integer    Combining class if found, else 0
     */
    protected function _get_combining_class($char)
    {
        return isset($this->NP['norm_combcls'][$char]) ? $this->NP['norm_combcls'][$char] : 0;
    }

    /**
     * Applies the canonical ordering of a decomposed UCS4 sequence
     * @param    array      Decomposed UCS4 sequence
     * @return   array      Ordered UCS4 sequence
     */
    protected function _apply_cannonical_ordering($input)
    {
        $swap = true;
        $size = count($input);
        while ($swap) {
            $swap = false;
            $last = $this->_get_combining_class(intval($input[0]));
            for ($i = 0; $i < $size-1; ++$i) {
                $next = $this->_get_combining_class(intval($input[$i+1]));
                if ($next != 0 && $last > $next) {
                    // Move item leftward until it fits
                    for ($j = $i + 1; $j > 0; --$j) {
                        if ($this->_get_combining_class(intval($input[$j-1])) <= $next) break;
                        $t = intval($input[$j]);
                        $input[$j] = intval($input[$j-1]);
                        $input[$j-1] = $t;
                        $swap = true;
                    }
                    // Reentering the loop looking at the old character again
                    $next = $last;
                }
                $last = $next;
            }
        }
        return $input;
    }
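As an illustration of what canonical ordering does, here is a small standalone sketch under stated assumptions. It is not the method above: it uses a plain insertion pass instead of the swap loop, the function name is hypothetical, and the combining-class table is passed in as a parameter rather than read from `$this->NP['norm_combcls']`. The two classes used in the example (230 for U+0301, 220 for U+0323) are the values from the Unicode Character Database. The array-destructuring swap requires PHP 7.1 or later.

```php
<?php
// Hypothetical helper: reorder combining marks by ascending combining class.
// $classes maps code point => canonical combining class; missing entries count as 0.
function canonical_order_sketch(array $cps, array $classes): array
{
    $n = count($cps);
    for ($i = 1; $i < $n; ++$i) {
        $cc = $classes[$cps[$i]] ?? 0;
        if ($cc === 0) {
            continue; // starters never move
        }
        // Move the mark leftward past marks with a strictly higher class.
        // A starter (class 0) or an equal class stops the move, which keeps
        // the ordering stable.
        for ($j = $i; $j > 0; --$j) {
            $prev = $classes[$cps[$j - 1]] ?? 0;
            if ($prev <= $cc) {
                break;
            }
            [$cps[$j - 1], $cps[$j]] = [$cps[$j], $cps[$j - 1]];
        }
    }
    return $cps;
}
```

With `$classes = [0x0301 => 230, 0x0323 => 220]`, the sequence `[0x61, 0x0301, 0x0323]` (a, acute, dot below) comes back as `[0x61, 0x0323, 0x0301]`, while `[0x0301, 0x61, 0x0323]` is left unchanged because the starter `a` separates the two marks.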

    /**
     * Do composition of a sequence of starter and non-starter
     * @param    array          UCS4 Decomposed sequence
     * @return   integer|false  Composed code point on match, false otherwise
     */
    protected function _combine($input)
    {
        $inp_len = count($input);
        foreach ($this->NP['replacemaps'] as $np_src => $np_target) {
            if ($np_target[0] != $input[0]) continue;
            if (count($np_target) != $inp_len) continue;
            $hit = false;
            foreach ($input as $k2 => $v2) {
                if ($v2 == $np_target[$k2]) {
                    $hit = true;
                } else {
                    $hit = false;
                    break;
                }
            }
            if ($hit) return $np_src;
        }
        return false;
    }

    /**
     * This converts a UTF-8 encoded string to its UCS-4 representation
     * By talking about UCS-4 "strings" we mean arrays of 32bit integers representing
     * each of the "chars". This is due to PHP not being able to handle strings with
     * bit depth different from 8. This applies to the reverse method _ucs4_to_utf8(), too.
     * The following UTF-8 encodings are supported:
     * bytes bits  representation
     * 1        7  0xxxxxxx
     * 2       11  110xxxxx 10xxxxxx
     * 3       16  1110xxxx 10xxxxxx 10xxxxxx
     * 4       21  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
     * 5       26  111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
     * 6       31  1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
     * Each x represents a bit that can be used to store character data.
     * The five and six byte sequences are part of Annex D of ISO/IEC 10646-1:2000
     * @param  string       $input  UTF-8 encoded string
     * @return array|false  UCS-4 code points, or false on malformed input
     */
    protected function _utf8_to_ucs4($input)
    {
        $output = array();
        $out_len = 0;
        // Patch by Daniel Hahler; work around problem with mbstring.func_overload
        if (function_exists('mb_strlen')) {
            $inp_len = mb_strlen($input, '8bit');
        } else {
            $inp_len = strlen($input);
        }
        $mode = 'next';
        $test = 'none';
        for ($k = 0; $k < $inp_len; ++$k) {
            $v = ord($input[$k]); // Extract byte from input string
            if ($v < 128) { // We found an ASCII char - put into string as is
                $output[$out_len] = $v;
                ++$out_len;
                if ('add' == $mode) {
                    $this->_error('Conversion from UTF-8 to UCS-4 failed: malformed input at byte '.$k);
                    return false;
                }
                continue;
            }
            if ('next' == $mode) { // Try to find the next start byte; determine the width of the Unicode char
                $start_byte = $v;
                $mode = 'add';
                $test = 'range';
                if ($v >> 5 == 6) { // &110xxxxx 10xxxxxx
                    $next_byte = 0; // Tells how many times subsequent bitmasks must be shifted 6 bits to the left
                    $v = ($v - 192) << 6;
                } elseif ($v >> 4 == 14) { // &1110xxxx 10xxxxxx 10xxxxxx
                    $next_byte = 1;
                    $v = ($v - 224) << 12;
                } elseif ($v >> 3 == 30) { // &11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                    $next_byte = 2;
                    $v = ($v - 240) << 18;
                } elseif ($v >> 2 == 62) { // &111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
                    $next_byte = 3;
                    $v = ($v - 248) << 24;
                } elseif ($v >> 1 == 126) { // &1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
                    $next_byte = 4;
                    $v = ($v - 252) << 30;
... This diff was truncated because it exceeds the maximum size that can be displayed.
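Because the listing of `_utf8_to_ucs4()` is cut off above, the following standalone sketch shows the same lead-byte / continuation-byte idea in compact form. It is an illustration under stated assumptions, not the truncated method: the function name is hypothetical, only the 1 to 4 byte forms are handled, overlong encodings and surrogates are not rejected, and it assumes `mbstring.func_overload` is not enabled so that `strlen()` counts bytes.

```php
<?php
// Hypothetical helper: decode a UTF-8 byte string into an array of code points.
// Returns false on a stray continuation byte, an unsupported lead byte,
// or a sequence that is cut short.
function utf8_to_codepoints_sketch(string $input)
{
    $out = [];
    $len = strlen($input);
    for ($k = 0; $k < $len; $k += $extra + 1) {
        $b = ord($input[$k]);
        if ($b < 0x80) {                 // 0xxxxxxx
            $cp = $b;
            $extra = 0;
        } elseif (($b >> 5) === 0x06) {  // 110xxxxx
            $cp = $b & 0x1F;
            $extra = 1;
        } elseif (($b >> 4) === 0x0E) {  // 1110xxxx
            $cp = $b & 0x0F;
            $extra = 2;
        } elseif (($b >> 3) === 0x1E) {  // 11110xxx
            $cp = $b & 0x07;
            $extra = 3;
        } else {
            return false;
        }
        if ($k + $extra >= $len) {
            return false;                // input ends inside a sequence
        }
        for ($i = 1; $i <= $extra; ++$i) {
            $c = ord($input[$k + $i]);
            if (($c & 0xC0) !== 0x80) {
                return false;            // not a 10xxxxxx continuation byte
            }
            $cp = ($cp << 6) | ($c & 0x3F);
        }
        $out[] = $cp;
    }
    return $out;
}
```

For example, `utf8_to_codepoints_sketch("m\xC3\xBCller")` yields `[0x6D, 0xFC, 0x6C, 0x6C, 0x65, 0x72]`: the two bytes `C3 BC` contribute the payload bits `00011` and `111100`, which join to 0xFC (ü).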
