Return value of istream::get()

This question is about the member function of basic_istream:

int_type get();

as described by N3337 27.7.2.3#4 (that is [istream.unformatted]). Presumably the actual Standard text is the same.

The text says:

After constructing a sentry object, extracts a character c, if one is available

Returns: c if available, otherwise traits::eof()

This text suggests that negative chars should return a negative value. We could compare with the next section, basic_istream<charT,traits>& get(char_type &c), which says:

After constructing a sentry object, extracts a character, if one is available, and assigns it to c.

This is very similar wording to get().

However, when I try get(), negative chars return a positive value; i.e. basic_istream::get() behaves like the C function getchar(). This would be the sensible behaviour (it allows EOF to be signalled unambiguously), but the Standard text does not seem to specify it. The C99 description of getchar() and friends specifically says that the character is obtained as an unsigned char converted to an int. basic_istream::get() does not have any equivalent text.
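A minimal sketch of the observation described above, assuming a single input byte with bit pattern 0xff. The helper name read_both_ways is made up for illustration. On the common implementations (libstdc++, libc++, MSVC) the int_type overload yields 255, while the char_type overload stores whatever plain char holds for that byte (-1 where char is signed).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: reads the first character of a non-empty `input`
// twice, once via int_type get() and once via get(char_type&).
std::pair<int, char> read_both_ways(const std::string& input) {
    std::istringstream a(input);
    std::istringstream b(input);

    int as_int = a.get();   // int_type get()
    char as_char = 0;
    b.get(as_char);         // get(char_type&)

    assert(a && b);         // both reads must succeed for non-empty input
    return std::make_pair(as_int, as_char);
}
```

Calling read_both_ways(std::string(1, '\xff')) gives a first member that differs from traits_type::eof(), which is the property the question is asking about.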

My question is: is get() meant to be specified to return a value in the range 0...UCHAR_MAX union EOF? Or should it return the actual char converted to int_type (via implicit conversion)? Or something else? What exactly does the Standard specify here, and what does it leave unspecified?

If "something else", how do I transform the result of int i = cin.get() to match the char value read by char ch; cin.get(ch); for the same input character?

Answers


To clear up your confusion: the difference just doesn't matter. Think about what you can and can't do with the return value of the parameterless get(). You can't reliably compare it to EOF or traits_type::eof() with ==, because int_type is not guaranteed to be equality-comparable in general (it is for the built-in char and wchar_t, though). To compare it correctly, use traits_type::eq_int_type(). Similarly, to extract a character from it after checking for EOF, use traits_type::to_char_type(), which converts the value back to char_type. In the other direction, get() can't rely on an implicit conversion either; it has to produce its result via traits_type::to_int_type().

In summary, the guarantee for getchar() that it returns the "unsigned" value or EOF is not necessary, since the traits_type encapsulates this knowledge and should be used for correct code.

Example use of parameterless istream::get():

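traits_type::int_type c = in.get();
// compare against eof() explicitly; not_eof(c) returns an int_type, which is 0 (false) for a NUL character
if(!traits_type::eq_int_type(c, traits_type::eof()))
    my_string += traits_type::to_char_type(c);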

Similar use of single-parameter istream::get():

traits_type::char_type c;
in.get(c);
if(in) // check for EOF or other input failure
    my_string += c;
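To tie the two snippets together, here is a rough self-contained sketch that drains a stream both ways and checks that the results agree, including for a NUL byte and a byte with the high bit set. The function names are made up for this example, and the test data is an arbitrary choice.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: drains `in` with int_type get(), going through the
// traits for both the EOF test and the conversion back to char_type.
template <class CharT, class Traits>
std::basic_string<CharT, Traits> read_via_int_get(std::basic_istream<CharT, Traits>& in) {
    std::basic_string<CharT, Traits> out;
    for (;;) {
        typename Traits::int_type i = in.get();
        if (Traits::eq_int_type(i, Traits::eof()))
            break;
        out += Traits::to_char_type(i);
    }
    return out;
}

// Hypothetical helper: drains `in` with get(char_type&).
template <class CharT, class Traits>
std::basic_string<CharT, Traits> read_via_char_get(std::basic_istream<CharT, Traits>& in) {
    std::basic_string<CharT, Traits> out;
    CharT c;
    while (in.get(c))
        out += c;
    return out;
}

// Both approaches should reproduce the input exactly.
void check_round_trip() {
    const std::string data("a\0\xff" "b", 4);  // 'a', NUL, 0xff, 'b'
    std::istringstream s1(data), s2(data);
    assert(read_via_int_get(s1) == data);
    assert(read_via_char_get(s2) == data);
}
```

So for the "how do I transform the result" part of the question: to_char_type() on the value from get() gives the same char that get(ch) would have stored.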

As per [char.traits.typedefs]:

typedef INT_T int_type;

Requires: For a certain character container type char_type, a related container type INT_T shall be a type or class which can represent all of the valid characters converted from the corresponding char_type values, as well as an end-of-file value, eof(). The type int_type represents a character container type which can hold end-of-file to be used as a return type of the iostream class member functions.

The only requirement the standard places on int_type is that it can hold all of the values of char_type (it doesn't even have to be a fundamental type), plus the value returned by eof(). The standard does, however, also require that char_traits<char>::int_type is int and that char_traits<wchar_t>::int_type is wint_t.

The reason you are seeing your chars converted to unsigned values is that GCC's char_traits<char>::to_int_type casts the character to unsigned char before returning it as an int, which keeps the character value 0xff distinct from EOF (note that the standard also requires char_traits<char>::eof() to return EOF, and char_traits<wchar_t>::eof() to return WEOF). Without the cast, a char holding 0xff would be sign-extended (where plain char is signed) to the equivalent of EOF, at least for GCC.
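A small sketch of that point, assuming an 8-bit char. It contrasts to_int_type with a plain implicit conversion for the byte 0xff. The function names are invented for the example, and the value 255 is what libstdc++, libc++ and MSVC produce rather than something this sketch proves for every implementation.

```cpp
#include <cassert>
#include <string>

typedef std::char_traits<char> ct;

// True if the character with bit pattern 0xff stays distinct from eof()
// after going through to_int_type.
bool ff_is_distinct_from_eof() {
    return !ct::eq_int_type(ct::to_int_type('\xff'), ct::eof());
}

// The value a plain implicit char -> int conversion produces instead:
// -1 where plain char is signed, 255 where it is unsigned.
int implicit_conversion_of_ff() {
    const char c = '\xff';
    return c;
}

void check_ff() {
    assert(ff_is_distinct_from_eof());
    assert(ct::to_int_type('\xff') == 255);  // common implementations
}
```

Where plain char is signed and EOF is -1, implicit_conversion_of_ff() returns the same value as EOF, which is exactly the collision the cast to unsigned char avoids.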

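As for casting (even implicitly) the return value of get() to a char: this works for GCC because of the way it defines out-of-range conversions to a signed integer type. The GCC manual documents this implementation-defined behaviour as follows: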

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

This isn't portable, however; you should call traits_type::to_char_type on the return value instead (after checking for EOF / eof(), of course).

