How do I read a text file having Unicode codes?

I initialize a string using the following code.

  std::string unicode8String = "\u00C1 M\u00F3ti S\u00F3l";

When I print it using cout, the output is Á Móti Sól.

But when I read the same string from a text file using ifstream, store it in a std::string, and print it, the output is \u00C1 M\u00F3ti S\u00F3l.

The content of my file is \u00C1 M\u00F3ti S\u00F3l and I want to print it as Á Móti Sól. Is there any way to do this?


Off the top of my head (completely untested)

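#include <cctype>
#include <string>

std::string convert_string(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ) {
        if (i + 5 < in.size() && in[i] == '\\' && in[i+1] == 'u' &&
            in[i+2] == '0' && in[i+3] == '0' &&
            std::isxdigit((unsigned char)in[i+4]) && std::isxdigit((unsigned char)in[i+5])) {
            // Turn the two hex digits into one byte (a Latin-1 character code).
            out += (char)std::stoi(in.substr(i + 4, 2), nullptr, 16);
            i += 6;
        } else {
            out += in[i++];   // copy ordinary characters unchanged
        }
    }
    return out;
}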

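But this won't work with any Unicode value above 255 (e.g. \u1234), because you have the fundamental problem that your string stores 8-bit characters, while Unicode code points can need up to 21 bits. It also produces one byte per character (Latin-1), so the result only displays correctly where that encoding is expected; a UTF-8 terminal needs the characters encoded as UTF-8 instead.

As I said, completely untested, but I'm sure you get the idea.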
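To get past the 255 limit while still using std::string, one option is to emit UTF-8 bytes for each escape. The following is only a sketch under some assumptions: the names append_utf8 and decode_escapes_utf8 are made up for illustration, every escape is taken to be exactly \u followed by four hex digits, and surrogate pairs (code points above U+FFFF) are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Append the UTF-8 encoding of a code point in the range U+0000..U+FFFF.
// (Hypothetical helper; surrogate values are not given special treatment.)
void append_utf8(std::string& out, unsigned cp)
{
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replace every \uXXXX escape (exactly four hex digits) with UTF-8 bytes.
// Anything that does not match that pattern is copied through unchanged.
std::string decode_escapes_utf8(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ) {
        bool is_escape = i + 5 < in.size() && in[i] == '\\' && in[i + 1] == 'u';
        for (std::size_t k = 2; is_escape && k < 6; ++k)
            is_escape = std::isxdigit(static_cast<unsigned char>(in[i + k])) != 0;
        if (is_escape) {
            unsigned long cp = std::stoul(in.substr(i + 2, 4), nullptr, 16);
            append_utf8(out, static_cast<unsigned>(cp));
            i += 6;
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

Passing the line read with std::getline through decode_escapes_utf8 turns \u00C1 into the two bytes c3 81, which a UTF-8 terminal shows as Á.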

Can you try printing using std::wcout?

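Unicode characters have a different representation in a text file (there is no \u). The \u escape is only interpreted by the compiler inside string literals; a text file normally stores the encoded bytes of the characters themselves.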
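The difference can be made visible by comparing two literals. This is just an illustrative sketch; the function names from_literal and from_file_text are invented here, and the second one merely imitates what std::getline would return for the file in the question.

```cpp
#include <string>

// What the compiler produces from the literal: the \u escapes are replaced
// by encoded characters at compile time, before the program ever runs.
std::string from_literal()
{
    return "\u00C1 M\u00F3ti S\u00F3l";
}

// What reading the question's file yields: each \uXXXX is still six
// ordinary characters (a backslash, 'u', and four digits).
std::string from_file_text()
{
    return "\\u00C1 M\\u00F3ti S\\u00F3l";
}
```

The second string is 25 characters long and starts with a backslash, which is why printing it shows the escapes verbatim.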

To check this:

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    // Write
    {
        std::string s = "\u00C1 M\u00F3ti S\u00F3l";
        std::ofstream out("/tmp/test.txt");
        out << s;
    }   // the file is closed here, before it is read back
    // Read Text
    {
        std::string s;
        std::ifstream in("/tmp/test.txt");
        std::getline(in, s);
        std::cout << "Result: " << s << std::endl;
    }
    // Read Binary
    {
        std::ifstream in("/tmp/test.txt", std::ios::binary);
        // istreambuf_iterator reads raw bytes and does not skip whitespace.
        std::istreambuf_iterator<char> first(in);
        std::istreambuf_iterator<char> last;
        std::vector<unsigned char> v(first, last);
        std::cout << "Result: ";
        for(unsigned c: v) std::cout << std::hex << c << ' ';
        std::cout << std::endl;
    }
    return 0;
}

On Linux with UTF-8:

  Result: Á Móti Sól
  Result: c3 81 20 4d c3 b3 74 69 20 53 c3 b3 6c

