How to check file encoding in Linux? Handling multilingual scripts

My company has PHP scripts with text in different languages (including French, German, Spanish, Italian and English).

The developers decided to use Latin-1 as the base encoding for everyone, so that nobody overrides a file's encoding and corrupts the foreign-language text in it. (At first some developers used HTML entities, but that approach is not preferred.)

I have a few questions for you:

  1. How can you check a file's encoding on Linux?
  2. If you have experience working with files in different languages, how did you avoid overriding the encoding others used?

Thanks in advance for any advice.

Answers


The developers decided to use Latin-1 as the base encoding for everyone, so that nobody overrides a file's encoding and corrupts the foreign-language text in it.

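Latin-1 covers only Western European languages, and not even those completely: it lacks the euro sign and the French œ ligature, for example. A Unicode encoding (typically UTF-8) is preferred.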
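To see where Latin-1 runs out, here is a minimal Python sketch. The helper name latin1_can_encode is made up for illustration; it just tries to encode a string with Python's built-in latin-1 codec and reports whether that worked.

```python
def latin1_can_encode(text: str) -> bool:
    """Return True if every character in text exists in Latin-1 (ISO-8859-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no Latin-1 code point.
        return False
```

With this, latin1_can_encode("déjà vu") and latin1_can_encode("Straße") are true, while latin1_can_encode("œuvre"), latin1_can_encode("100 €") and latin1_can_encode("Привет") are false.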

How can you check a file's encoding on Linux?

With the file utility. It can only guess though.

If you have experience working with files in different languages, how did you avoid overriding the encoding others used?

Sensibly configured editors.


file gives you information about a file, including its charset, language, and so on, depending on the file type.

Use --mime-encoding to get only the information you want.
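If you want to call this from a script, something like the following Python sketch should work. It assumes the file utility is on the PATH; the function name guess_encoding is hypothetical, and the result is still only file's guess.

```python
import shutil
import subprocess
from typing import Optional


def guess_encoding(path: str) -> Optional[str]:
    """Ask file(1) for its best guess at the encoding of path.

    Returns None when the file utility is not installed or exits with an error.
    """
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        # --brief drops the leading "filename:" from the output;
        # "--" stops option parsing so odd file names are not read as flags.
        ["file", "--brief", "--mime-encoding", "--", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

Treat the returned string as a hint to check by eye, not as a fact about the file.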


1. I have used iconv for converting back and forth, but since you don't know the encoding, try enca (Extremely Naive Charset Analyser) first. But in general, it is very hard to get it right since it requires knowledge of common words etc.

2. The only sane approach is to use a larger character set such as Unicode. You could enforce this by adding a pre-checkin hook to your source control system that only accepts valid UTF-8 files (for instance).
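Both points can be sketched in Python. This is only an illustration: the function names are invented, how the hook gets its list of files depends on your source control system, and latin1_to_utf8 simply assumes the input really is Latin-1 (it is roughly what iconv -f ISO-8859-1 -t UTF-8 does).

```python
from pathlib import Path
from typing import Iterable, List


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (pure ASCII also passes)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8(paths: Iterable[str]) -> List[str]:
    """Return the paths whose contents are not valid UTF-8.

    A pre-checkin hook could reject the commit when this list is non-empty.
    """
    return [p for p in paths if not is_valid_utf8(Path(p).read_bytes())]


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that are ASSUMED to be Latin-1 as UTF-8.

    Decoding as Latin-1 never raises, so a wrong assumption produces
    garbled text silently instead of an error.
    """
    return data.decode("latin-1").encode("utf-8")
```

A hook script would pass the changed files to find_non_utf8 and exit with a non-zero status if anything comes back. Converting the existing files is a one-off job, and worth reviewing by hand because of the caveat in latin1_to_utf8.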


There is no reliable way of checking the encoding of a file; the various 8-bit single-byte encodings are virtually indistinguishable without inspection. Using UTF-8 everywhere means that everyone has a single, universally-valid encoding to work with.

