How to detect the character encoding of a text file?
Posted by Cédric Boivin on Stack Overflow
Published on 2010-12-23T15:40:15Z
Indexed on 2010/12/23 15:54 UTC
I am trying to detect which character encoding is used in my file.
I tried the following code, which checks for the byte order marks of the standard encodings:
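using System.IO;
using System.Text;

public static Encoding GetFileEncoding(string srcFile)
    {
      // *** Use Default of Encoding.Default (Ansi CodePage)
      Encoding enc = Encoding.Default;
      // *** Detect byte order mark if any - otherwise assume default
      byte[] buffer = new byte[5];
      int read;
      // 'using' closes the stream even if Read throws
      using (FileStream file = new FileStream(srcFile, FileMode.Open, FileAccess.Read))
      {
        // Track how many bytes were read so short files are not matched against zero padding
        read = file.Read(buffer, 0, 5);
      }
      if (read >= 3 && buffer[0] == 0xef && buffer[1] == 0xbb && buffer[2] == 0xbf)
        enc = Encoding.UTF8;
      else if (read >= 4 && buffer[0] == 0 && buffer[1] == 0 && buffer[2] == 0xfe && buffer[3] == 0xff)
        // 00 00 FE FF is the big-endian UTF-32 BOM; Encoding.UTF32 is little-endian
        enc = new UTF32Encoding(true, true);
      else if (read >= 4 && buffer[0] == 0xff && buffer[1] == 0xfe && buffer[2] == 0 && buffer[3] == 0)
        // FF FE 00 00 is little-endian UTF-32; test it before UTF-16, which shares FF FE
        enc = Encoding.UTF32;
      else if (read >= 3 && buffer[0] == 0x2b && buffer[1] == 0x2f && buffer[2] == 0x76)
        enc = Encoding.UTF7;
      else if (read >= 2 && buffer[0] == 0xFE && buffer[1] == 0xFF)
        // 1201 unicodeFFFE Unicode (Big-Endian)
        enc = Encoding.GetEncoding(1201);
      else if (read >= 2 && buffer[0] == 0xFF && buffer[1] == 0xFE)
        // 1200 utf-16 Unicode
        enc = Encoding.GetEncoding(1200);
      return enc;
    }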
The first five bytes of my file are 60, 118, 56, 46 and 49.
Is there a chart that shows which encoding matches those five first bytes?
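
For reference, the byte order marks such a chart would contain can be sketched as a small lookup table. This is only a sketch under stated assumptions, not a complete detector: the names BomChart, Chart and Describe are made up for illustration, and a file that has no BOM cannot be identified this way. The five bytes quoted above are plain ASCII characters and match none of the entries.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomChart
{
    // Hypothetical chart: each row pairs a byte order mark with the encoding it signals.
    // Longer marks come first because the UTF-32 LE mark starts with the UTF-16 LE mark.
    private static readonly (byte[] Bom, string Name)[] Chart =
    {
        (new byte[] { 0x00, 0x00, 0xFE, 0xFF }, "UTF-32 big-endian"),
        (new byte[] { 0xFF, 0xFE, 0x00, 0x00 }, "UTF-32 little-endian"),
        (new byte[] { 0xEF, 0xBB, 0xBF },       "UTF-8"),
        (new byte[] { 0x2B, 0x2F, 0x76 },       "UTF-7"),
        (new byte[] { 0xFE, 0xFF },             "UTF-16 big-endian"),
        (new byte[] { 0xFF, 0xFE },             "UTF-16 little-endian"),
    };

    // Returns the name of the first chart entry whose mark prefixes 'head'.
    public static string Describe(byte[] head)
    {
        foreach (var (bom, name) in Chart)
        {
            if (head.Length >= bom.Length && head.Take(bom.Length).SequenceEqual(bom))
                return name;
        }
        return "no BOM found";
    }

    public static void Main()
    {
        byte[] head = { 60, 118, 56, 46, 49 };
        Console.WriteLine(Describe(head));                 // prints "no BOM found"
        Console.WriteLine(Encoding.ASCII.GetString(head)); // prints "<v8.1"
    }
}
```

Since no mark matches, the encoding of such a file can only be guessed from its content or taken from outside knowledge; the bytes themselves are valid in ASCII, UTF-8 and most ANSI code pages alike.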
© Stack Overflow or respective owner