Non-English UTF-8 Language Support
Posted by Joe Rebis (Import) on 10 August 2007 06:05 PM
EPhost works hard to ensure that our Control Panels, Statistics and web-based email applications are provided in many languages. However, there are some potential issues when you use a language other than English for file and folder names in your website. The first has to do with the actual files and folders on the operating system. The second has to do with encoded URL requests as they relate to our security precautions. Lastly, your FTP client and our FTP server are also a factor.
Operating Systems / Hosting Accounts
Officially, our operating systems / hosting plans (Windows and Linux) are configured for English language support, and technical support is provided solely in English. We highly suggest using English language file/folder names exclusively. While a problem might not reveal itself right away, there will likely be a problem later. This doesn't mean that your website application itself (e.g. DotNetNuke (DNN), Java, .NET, PHP) cannot support non-English language options.
NOTE: HTML5 and modern operating systems recognize UTF-8 and UTF-16.
Applications, Frameworks & Platforms
Foreign language support is provided by applications such as DNN, WordPress or other PHP-based applications, but sometimes there are limitations. For instance, some DNN modules try to create actual folders in a language other than English; these folders won't be named/stored correctly because our file system is configured for English. DotNetNuke users should note that a DNN page (tab) is handled differently, as no actual folder is created (although it may appear that way in the URL due to URL rewriting). So, you can have page names (tabs) in DNN in your language and they can even "appear" as such in the URL. However, the actual file folders in the operating system are in English.
NOTE: UTF-8 and UTF-16 are used in major modern applications, frameworks and platforms (e.g. ASP.NET, DNN, Java, .NET, and PHP)
Have you ever seen a URL with % signs and odd-looking letters? These are encoded URLs: each % sign is followed by two hexadecimal digits that stand for one byte of a character. Character encoding can be a huge issue if you are not using standard characters in your website. We take precautions so that encoded URLs are checked twice for security issues before being allowed. This usually applies to non-English URLs too. If we are unable to "normalize/decode" the URL, we will reject the request in an attempt to avoid a security problem. This also applies to the length of your encoded URL (request). Some longer non-English file and folder names may translate into very long encoded URLs, which we restrict for security reasons.
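To illustrate how a non-English name grows once it is encoded, here is a minimal Python sketch. It is not our actual URL scanner; the helper names `encode_name` and `can_decode` and the sample file names are made up for this example, and it assumes the name is encoded as UTF-8, which is what browsers typically do.

```python
from urllib.parse import quote, unquote


def encode_name(name: str) -> str:
    """Percent-encode a file name as UTF-8 (hypothetical helper)."""
    return quote(name, safe="")


def can_decode(encoded: str) -> bool:
    """Return True if the percent-encoded text decodes cleanly as UTF-8."""
    try:
        unquote(encoded, encoding="utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


english = "file.txt"
russian = "файл.txt"  # sample non-English name, for illustration only

# Plain English letters pass through unchanged.
print(english, "->", encode_name(english), len(encode_name(english)))

# Each Cyrillic letter becomes two bytes, and each byte becomes "%XX".
print(russian, "->", encode_name(russian), len(encode_name(russian)))

# A sequence that is not valid UTF-8 cannot be decoded.
print(can_decode("%D1%84"), can_decode("%FF"))
```

In this sketch both names are 8 characters long, but the encoded Russian name is 28 characters. A full path made of such names can reach a length limit much sooner than its English equivalent.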
You may see files (e.g. ????.??) in FTP. This is due to attempting to upload/create files in a different language. Your ability to see these files depends on your FTP client's UTF-8 abilities and how the operating system treats them. You may find that you are able to FTP and even see the files and folders in a different language, but file operations (e.g. DNN displaying a directory of images) on those files MAY fail.
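The question marks can be reproduced with a short Python sketch. It assumes, purely for illustration, that one side of the transfer can only represent plain ASCII; the exact character set used by a given FTP client or server may differ. The helper names `shown_as` and `is_english_safe` are hypothetical.

```python
def shown_as(name: str) -> str:
    """Show how a name looks when every non-ASCII character is replaced."""
    return name.encode("ascii", errors="replace").decode("ascii")


def is_english_safe(name: str) -> bool:
    """Return True if the name uses only plain ASCII characters."""
    return name.isascii()


# Sample names, for illustration only.
for name in ["logo.png", "файл.txt"]:
    print(name, "->", shown_as(name), "| safe:", is_english_safe(name))
```

A check like `is_english_safe` could be run over your file names before you upload them, to catch names that are likely to cause trouble later.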
Again, we suggest using English language file/folder names. Our operating systems are configured for English language support. Additionally, removing the URL scanning would significantly reduce our ability to scan for SQL injection and cross-site scripting attacks, which are major threats to your website.
If you are still having issues, please contact us for further help.