Username considerations
Survey on different systems
URL encoding
Since usernames are submitted as URL parameters, they must be URL-encodable.
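As a minimal sketch of what "URL-encodable" means in practice, the following Python snippet percent-encodes a username with the standard library. The helper name `encode_username` is hypothetical and not taken from any of the surveyed systems; passing `safe=''` is an assumption made here so that a slash is encoded as well.

```python
from urllib.parse import quote, unquote


def encode_username(username: str) -> str:
    """Percent-encode a username (hypothetical helper) for use in a URL.

    safe='' makes sure that '/' is encoded too instead of being
    treated as a path separator.
    """
    return quote(username, safe='')


encoded = encode_username("mati/test user")
print(encoded)           # mati%2Ftest%20user
print(unquote(encoded))  # mati/test user
```

Decoding with `unquote` returns the original name, so the encoding itself does not restrict which characters a username may contain.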
HTTP Basic authentication
Usernames must not contain a colon (":") or ASCII control characters (0-31, 127). However, the sequence "CR LF" followed by either a space or a horizontal tab is valid.
For details, refer to Section 2, "Basic Authentication Scheme", of RFC2617<ref>RFC2617 - HTTP Authentication: Basic and Digest Access Authentication</ref>.
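The restriction can be sketched as a simple check. The function name `valid_for_basic_auth` is hypothetical, and the sketch is deliberately stricter than the RFC: it rejects every control character and therefore also the folded whitespace sequence that the RFC permits.

```python
def valid_for_basic_auth(username: str) -> bool:
    """Return True if the username contains neither a colon nor an
    ASCII control character (0-31, 127).

    Assumption: the CR LF + space/tab sequence allowed by the RFC is
    rejected as well, to keep the check simple.
    """
    for char in username:
        code = ord(char)
        if char == ':' or code <= 31 or code == 127:
            return False
    return True
```

For example, `valid_for_basic_auth("user name")` is accepted, while `"user:name"` and any name containing a tab or newline are rejected.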
MediaWiki
For a username to be legal, it must be a valid page title (for the user page). Illegal characters for page titles are<ref>http://meta.wikimedia.org/wiki/Help:Page_Name#Restrictions</ref>:
#<>[]|{}
Certain character sequences ("/..", etc.) are also illegal<ref>http://en.wikipedia.org/wiki/Wikipedia:Naming_conventions_%28technical_restrictions%29</ref> but are already filtered by the general check.
Certain special usernames are also filtered: names that look like IPv4 or IPv6 addresses and names containing certain Unicode characters; see User::isValidUserName() in includes/User.php:
$unicodeBlacklist = '/[' .
'\x{0080}-\x{009f}' . # iso-8859-1 control chars
'\x{00a0}' . # non-breaking space
'\x{2000}-\x{200f}' . # various whitespace
'\x{2028}-\x{202f}' . # breaks and control chars
'\x{3000}' . # ideographic space
'\x{e000}-\x{f8ff}' . # private use
']/u';
Additionally, MediaWiki filters new usernames that contain an "@" sign<ref>See MediaWiki variable $wgInvalidUsernameCharacters</ref>.
The maximum username length is 255 bytes (which is fewer than 255 characters if the name contains multibyte characters). This is enforced both by $wgMaxNameChars and by page name restrictions.
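The MediaWiki rules described in this section can be approximated in Python as follows. This is only a sketch under stated assumptions: the function and constant names are hypothetical, the character ranges are copied from the PHP blacklist quoted above, and the checks for IP-like names, namespace prefixes and sequences such as "/.." are left out.

```python
import re

# Characters that are illegal in page titles (see the list above).
ILLEGAL_TITLE_CHARS = set('#<>[]|{}')

# Python port of the ranges in $unicodeBlacklist.
UNICODE_BLACKLIST = re.compile(
    '['
    '\u0080-\u009f'  # iso-8859-1 control chars
    '\u00a0'         # non-breaking space
    '\u2000-\u200f'  # various whitespace
    '\u2028-\u202f'  # breaks and control chars
    '\u3000'         # ideographic space
    '\ue000-\uf8ff'  # private use
    ']'
)

MAX_NAME_BYTES = 255  # the limit is counted in bytes, not characters


def looks_valid_for_mediawiki(username: str) -> bool:
    """Approximate check; not MediaWiki's actual implementation."""
    if not username:
        return False
    if ILLEGAL_TITLE_CHARS & set(username):
        return False
    if '@' in username:  # filtered for new accounts
        return False
    if UNICODE_BLACKLIST.search(username):
        return False
    return len(username.encode('utf-8')) <= MAX_NAME_BYTES
```

The byte-based limit matters for multibyte names: 127 repetitions of "ä" (254 bytes in UTF-8) pass the length check, while 128 repetitions (256 bytes) do not, even though both are well under 255 characters.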
Drupal
(fsinf.at)
http://api.drupal.org/api/drupal/modules--user--user.module/function/user_validate_name/6
vBulletin
(informatik-forum.at)
phpBB
The double quote (") and its HTML entity (&quot;) are always invalid.
By default, usernames are restricted to 'USERNAME_CHARS_ANY', which expands to the regex "^.+$"
WordPress
XMPP
XMPP allows usernames in Unicode, but some characters are forbidden:
"&'/:<>@
... as well as:
- ASCII Space characters
- Non-ASCII Space characters
- ASCII Control characters
- Non-ASCII Control characters
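The character classes listed above correspond to tables in Python's `stringprep` module, so a rough check can be sketched as shown below. The function name `valid_xmpp_localpart` is hypothetical, and the sketch covers only the restrictions listed in this section, not the full preparation profile that XMPP servers apply.

```python
import stringprep

# Forbidden characters as listed above.
FORBIDDEN_XMPP_CHARS = set('"&\'/:<>@')


def valid_xmpp_localpart(username: str) -> bool:
    """Rough check against the restrictions listed in this section."""
    if not username:
        return False
    for char in username:
        if char in FORBIDDEN_XMPP_CHARS:
            return False
        # ASCII and non-ASCII space characters
        if stringprep.in_table_c11(char) or stringprep.in_table_c12(char):
            return False
        # ASCII and non-ASCII control characters
        if stringprep.in_table_c21(char) or stringprep.in_table_c22(char):
            return False
    return True
```

A name such as "jürgen" passes, while "a b", "a@b" or a name containing a non-breaking space is rejected.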
Linux system accounts
On Debian, usernames must not start with a dash ("-") and must not contain a colon (":") or any kind of whitespace (" ", newline, etc.). Usernames may be up to 32 characters long<ref>see man useradd</ref>. A slash ("/") is apparently accepted in a username as well, but it should be avoided.
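The Debian rules stated above translate into a short check. This is a sketch only: the function name `valid_debian_username` is hypothetical, and it implements just the rules named in this section, not whatever additional checks a particular tool may apply.

```python
def valid_debian_username(username: str) -> bool:
    """Check the rules listed above: 1-32 characters, no leading
    dash, no colon, no whitespace of any kind."""
    if not username or len(username) > 32:
        return False
    if username.startswith('-') or ':' in username:
        return False
    return not any(char.isspace() for char in username)
```

Note that this check accepts a name containing a slash, which matches the observation above but is exactly the kind of name that should be avoided.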
Windows system accounts
Windows has several reserved names that cannot be used as usernames<ref>http://support.microsoft.com/kb/909264</ref>. Additionally, some characters are invalid:
"/\[]:;|=,+*?<>
Otherwise, usernames can contain all other special characters, including spaces, periods, dashes, and underscores<ref>http://technet.microsoft.com/en-us/library/bb726984.aspx</ref>.
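A character check for Windows account names might look like the sketch below. The names are hypothetical, the character set is copied from the list above, and the reserved names are deliberately not checked because they are not enumerated here.

```python
# Invalid characters as listed above.
WINDOWS_INVALID_CHARS = set('"/\\[]:;|=,+*?<>')


def has_only_valid_windows_chars(username: str) -> bool:
    """Return True if the name is non-empty and contains none of the
    invalid characters. Reserved names are NOT checked."""
    return bool(username) and not (WINDOWS_INVALID_CHARS & set(username))
```

Names with spaces, periods, dashes or underscores pass this check, while a name such as "a+b" or "a\b" does not.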
E-mail addresses
Wikipedia provides a good reference<ref>http://en.wikipedia.org/wiki/Email_address</ref>:
The local-part of the e-mail address may use any of these ASCII characters:
- Uppercase and lowercase English letters (a–z, A–Z)
- Digits 0 to 9
- Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
- Character . (dot, period, full stop) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@example.com is not allowed).
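The rules for the local-part quoted above can be expressed as a regular expression. This is a sketch of exactly those rules, with hypothetical names; it does not cover quoted local-parts or any other form of address.

```python
import re

# One or more of the allowed characters, excluding the dot.
ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"

# Atoms separated by single dots: no leading, trailing or doubled dot.
LOCAL_PART_RE = re.compile(r"{0}(?:\.{0})*".format(ATOM))


def valid_local_part(local_part: str) -> bool:
    """Return True if the whole string matches the rules above."""
    return LOCAL_PART_RE.fullmatch(local_part) is not None
```

With this pattern, "John.Doe" and "john+tag" are accepted, while "John..Doe", ".John" and "John." are rejected.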
Summary
ASCII table
char | URLs | HTTP basic auth | MediaWiki | Drupal | vBulletin | phpBB | WordPress | XMPP | Linux | Windows | RestAuth | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
(space) | Y | Y | Y | N | N | N | |||||||
! | Y | Y | Y | Y<ref group="n" name="linux-not-recommended">Not recommended</ref> | Y | |||||||
" | Y | Y | N | N | Y<ref group="n" name="linux-not-recommended" /> | N | ||||||
# | Y | Y | N | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | ||||||
$ | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
% | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
& | Y | Y | N | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
' | Y | Y | N | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
( | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | N | |||||||
) | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | N | |||||||
* | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
+ | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y<ref group="n">often used as tag delimiter</ref> | |||||||
, | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | N | |||||||
- | Y | Y | Y | Y<ref group="n" name="must-not-start">must not start with this character</ref> | Y | |||||||
. | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y<ref group="n">not the first or last character, must not appear twice in a row</ref> | |||||||
/ | Y<ref group="n">Only theoretically supported; it behaves very differently depending on the Django setup.</ref> | Y | N | N | Y | N | ||||||
[0-9] | Y | Y | Y | Y<ref group="n">Not recommended at start of username</ref> | Y | |||||||
: | Y | N | Y<ref group="n" name="must-not-start" /><ref group="n">If the part before the ':' collides with a namespace or interwiki prefix, it is illegal too</ref> | N | N | N | ||||||
; | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | N | |||||||
< | Y | Y | N | N | Y<ref group="n" name="linux-not-recommended" /> | N | ||||||
= | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
> | Y | Y | N | N | Y<ref group="n" name="linux-not-recommended" /> | N | ||||||
? | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
@ | Y | Y | Y<ref group="n">Blocked during account creation, see #MediaWiki.</ref> | N | Y<ref group="n" name="linux-not-recommended" /> | N | ||||||
[A-Z] | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y<ref group="n">In practice, many systems are case insensitive</ref> | |||||||
[ | Y | Y | N | Y | Y<ref group="n" name="linux-not-recommended" /> | N | ||||||
\ | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | N | |||||||
] | Y | Y | N | Y | Y<ref group="n" name="linux-not-recommended" /> | N | ||||||
^ | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
_ | Y | Y | Y | Y | Y | |||||||
` | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | |||||||
[a-z] | Y | Y | Y | Y | Y | |||||||
{ | Y | Y | N | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | ||||||
| | Y | Y | N | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | ||||||
} | Y | Y | N | Y | Y<ref group="n" name="linux-not-recommended" /> | Y | ||||||
~ | Y | Y | Y | Y<ref group="n" name="linux-not-recommended" /> | Y |
<references group="n" />
Minimum/Maximum username length
URLs | HTTP basic auth | MediaWiki | Drupal | vBulletin | phpBB | WordPress | XMPP | Linux | Windows | RestAuth | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|
min | 1 | 1 | 1 | |||||||||
max | -<ref group="m">dependent on the webserver</ref> | - | 255 bytes<ref group="m">http://meta.wikimedia.org/wiki/Help:Page_Name#Maximum_page_name_length</ref> | 32 |
<references group="m" />
Conclusions
- Names containing slashes ("/"), colons (":") or backslashes ("\") are illegal, no matter what. This makes life considerably easier.
- Names containing ASCII control characters (<= 31, NOT space!) or DEL (127) are also illegal. Note that this actually makes some usernames illegal that could be used in a URL: URLs can contain CRLF followed by SP or HT according to the RFC!
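The two conclusions above can be combined into a single pattern. This is a minimal sketch with hypothetical names, not the actual RestAuth implementation.

```python
import re

# Slash, colon, backslash, ASCII control characters (0-31) and DEL (127).
# The space character (32) is intentionally not part of the pattern.
ILLEGAL_CHARS_RE = re.compile(r'[/:\\\x00-\x1f\x7f]')


def violates_conclusions(username: str) -> bool:
    """Return True if the username contains any character ruled out
    by the conclusions above."""
    return ILLEGAL_CHARS_RE.search(username) is not None
```

A name with a space, such as "user name", is not flagged, whereas "a/b", "a:b", "a\b" and names containing a newline or DEL are.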
Further reading
- Percent-encoding
- RFC3986 - Uniform Resource Identifier (URI): Generic Syntax
- http://docs.python.org/library/stringprep.html
References
<references />