printf
and scanf
are the two standard C programming
language functions for console input and output.
A variation of these commands (fprintf
and fscanf
)
also allows I/O to files.
Another (sprintf
and sscanf
) allows
I/O to strings.
(sscanf
is especially useful.)
All these as well as many other I/O functions are declared in the
standard header file <stdio.h>
.
Using scanf
is trickier than printf
.
This is because of promotion, so that any type smaller than
an int
is promoted to an int
when passed to
printf
, and float
s are promoted to
double
.
This means there is a single code for most integers and one for most
floating point types.
There is no such promotion with scanf
arguments, which are
actually pointers.
So all types must be exactly specified.
In addition there are security concerns with input that don't apply to
output.
(These will be discussed below.
In particular, never use the gets()
function in
production-quality C code!)
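The difference between promoted `printf` arguments and exactly typed `scanf` arguments can be sketched as follows. This is a minimal illustration, not part of the original text; the helper names `show_promoted` and `read_exact` are hypothetical, and `snprintf` and `sscanf` are used so that no console is needed.

```c
#include <stdio.h>

/* Hypothetical helper: a short and a float are formatted with the same
   letters used for int and double, because both are promoted. */
int show_promoted(char *out, size_t size, short s, float f)
{
    /* s arrives as an int and f as a double. */
    return snprintf(out, size, "%i %.2f", s, f);
}

/* Hypothetical helper: the arguments are pointers, so nothing is promoted
   and the length modifiers h (short) and l (double) are required. */
int read_exact(const char *in, short *s, double *d)
{
    return sscanf(in, "%hi %lf", s, d);
}
```

Calling `read_exact("12 2.5", &s, &d)` with a `short s` and a `double d` should return 2; dropping the `h` or the `l` would make the behavior undefined.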
Not discussed here is Unicode and wide character support.
(Hint: Use the *wprintf
and *wscanf
functions declared in <wchar.h>.)
A call to printf
looks like this:
printf( "format-string", expression, ... );
Or you can use fprintf
to send the output to the standard error stream, which normally still
appears on the screen when standard output is redirected, like this:
fprintf( stderr, "format-string", expression, ... );
The format-string can contain regular characters which are simply printed out, and format specifications or place-holders. For each place-holder in the format-string there must be one matching expression. The expressions are converted to strings according to the instructions in the corresponding place-holder and are mixed with the regular text in the format-string. Then the whole string is output. Here's an example:
printf( "%i + %i = %i\n", 2, 3, (2+3) );
2 + 3 = 5
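The same format-string rules apply when the output goes to a string instead of the console. Here is a minimal sketch using `snprintf` (available since C99); the helper name `format_sum` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes "a + b = sum" and a newline into out.
   Like snprintf, it returns the number of characters the full text needs,
   not counting the terminating '\0'. */
int format_sum(char *out, size_t size, int a, int b)
{
    return snprintf(out, size, "%i + %i = %i\n", a, b, a + b);
}
```

With a buffer of at least 11 bytes, `format_sum(buf, sizeof buf, 2, 3)` leaves the same text in `buf` that the `printf` call above prints.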
The following table shows the different format letters you can use with
printf
.
Each letter corresponds to a different type of argument expression.
It is important to use the correct letter that matches the type
of the expression.
The use of any other letter results in undefined behavior.
(Note that %a, %A, and %F require C99 or later, and %b requires C23; the
others should be available with any standard C compiler.)
The arguments can be any expression of the correct type (such as literals),
but usually are variables whose values were computed earlier.
In the examples, remember that 17 is an int literal,
17L is a long int literal,
017 is an int literal written in octal (with the decimal value 1*8 + 7 = 15),
0x17 and 0X1A are int literals written in hexadecimal (with decimal values of
1*16 + 7 = 23 and 1*16 + A = 16 + 10 = 26),
17u is an unsigned decimal integer literal,
'A' is a character constant (which has type int in C) with a decimal value of 65 in ASCII,
3.14 and 0.314E1 are double literals,
3.14f is a float literal,
and finally 3.14L is a long double literal.
Letter | Type of Matching Argument | Example | Output |
---|---|---|---|
% | none (See note) | printf( "%%" ); | % |
b | unsigned int (See note) | printf("%b", 5) | 101 |
d, i | int (See note) | printf( "%i", 17 ); | 17 |
u | unsigned int (Converts to decimal) | printf( "%u", 17u ); | 17 |
o | unsigned int (Converts to octal) | printf( "%o", 17 ); | 21 |
x | unsigned int (Converts to lower-case hex) | printf( "%x", 26 ); | 1a |
X | unsigned int (Converts to upper-case hex) | printf( "%X", 26 ); | 1A |
f, F | double (See note) | printf( "%f", 3.14 ); | 3.140000 |
e, E | double (See note) | printf( "%e", 31.4 ); | 3.140000e+01 |
g, G | double (See note) | printf( "%g, %g", 3.14, 0.0000314 ); | 3.14, 3.14e-05 |
a, A | double (See note) | printf( "%a", 31.0 ); | 0x1.fp+4 |
c | int (See note) | printf( "%c", 65 ); | A |
s | string (See note) | printf( "%s", "Hello" ); | Hello |
p | void* (See note) | int a = 1; printf( "%p", &a ); | 0064FE00 |
n | int* (See note) | int a; printf( "ABC%n", &a ); | ABC (a==3) |
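A few of the conversion letters from the table can be checked with a short sketch like the one below. The helper names `format_bases` and `format_reals` are hypothetical, and `snprintf` is used only so the results land in a string.

```c
#include <stdio.h>

/* Hypothetical helper: one unsigned value in decimal, octal, and both
   cases of hexadecimal. */
int format_bases(char *out, size_t size, unsigned int v)
{
    return snprintf(out, size, "%u %o %x %X", v, v, v, v);
}

/* Hypothetical helper: one double in fixed, exponent, and "general" form. */
int format_reals(char *out, size_t size, double v)
{
    return snprintf(out, size, "%f %e %g", v, v, v);
}
```

For example, `format_bases` applied to 26 should produce `26 32 1a 1A`, and `format_reals` applied to 31.4 should produce `31.400000 3.140000e+01 31.4`.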
printf Conversion Specification Formatting Syntax
To control the appearance of the converted arguments, any or all (or none) of
the following format controls may be used between the % and the final letter of the
conversion specification.
Note these must appear (if at all) in the sequence shown here.
A · is used to indicate a space in the example output where spacing is not obvious.
% flags minimum-field-width .precision length Letter
Format Control | Description | Example | Output |
---|---|---|---|
flags | The flag characters may appear in any order and have the following meanings: | | |
flags: - | Left-justify within the field. | printf( "|%3i|%-3i|", 12, 12); | |·12|12·| |
flags: + | Forces positive numbers to include a leading plus sign. | printf( "%+i", 17); | +17 |
flags: space | Forces positive numbers to include a leading space. | printf( "|% i|", 12); | |·12| |
flags: # | This flag forces the output to be in some alternate form. | printf( "%#X", 26); | 0X1A |
flags: 0 | Pad with zeros rather than spaces. | printf( "|%04i|", 12); | |0012| |
flags: ' | Format integers with the current locale's thousands grouping character. (This flag is a POSIX extension rather than standard C, and the output shown assumes a locale that groups with commas.) | printf( "|%'i|", 1234567); | |1,234,567| |
minimum field-width | After converting any value to a string, the field width represents the minimum number of characters in the resulting string. (See note.) If the converted value has fewer characters, the result is padded with spaces on the left by default; the 0 flag pads with zeros instead and the - flag pads on the right. | printf( "|%5s|", "ABC"); | |··ABC| |
minimum field-width: * | Sometimes the minimum field width isn't known at compile-time, and must be computed at run-time. (For example, printing a table where the width of a column depends on the widest column value in the input.) In this case the field width can be specified as an asterisk ("*"), which acts like a place-holder for an int value used for the field width. The value appears in the argument list before the expression being converted. | printf( "|%-*s|", 5, "ABC" ); | |ABC··| |
.precision | A period by itself implies a precision of zero. A precision may be replaced with an asterisk ("*"), which works exactly the same as for an asterisk minimum field width described above. The meaning of a precision depends on the type of conversion done. Only the conversions listed below are defined: | | |
.precision: floating-point | When used with floating-point conversion letters (a, A, e, E, f, F, g, and G) the precision specifies how many digits will appear to the right of the decimal point. The default precision is six (for a and A, the default is enough digits to represent the value exactly). (For conversion letters g and G, the precision is actually the maximum number of significant digits.) The value displayed is always rounded, but note this doesn't change the matching expression in any way. If the precision is zero, no decimal point appears at all (but see "#" flag above). | printf( "|%5.2f|", 3.147 ); | |·3.15| |
.precision: integer | When used with integer conversion letters (d, i, o, u, x, and X) the precision specifies the minimum number of digits to appear. Leading zeros are added as needed. | printf( "|%6.4i|", 17 ); | |··0017| |
.precision: string | When used with string conversions (letter "s") the precision specifies the maximum number of bytes written. (See note.) If the string is too long it will be truncated. | printf( "|%-5.3s|", "ABCD" ); | |ABC··| |
length | A length modifier is used to exactly specify the type of the matching argument. Since most types are promoted to int or double a length modifier is rarely used. However it is used for long and other types that don't have an explicit conversion letter of their own. Note that specific length modifiers only make sense in combination with specific conversion letters. Using undefined combinations causes unpredictable results. The length modifiers and their meanings are: | | |
length: hh | signed char or unsigned char | printf( "%hhi", 300 ); (See note) | 44 |
length: h | short or unsigned short | printf( "%hi", 300 ); | 300 |
length: l | long or unsigned long | long a = 300, b = (long) 1.0E+14; | |
length: ll | long long or unsigned long long | printf( "%#llX", 300ULL ); | 0X12C |
length: j | intmax_t or uintmax_t | printf( "%ji", (intmax_t) 17 ); | 17 |
length: z | size_t | printf( "%zu", sizeof(int) ); | 4 |
length: t | ptrdiff_t | char a[5] = "abcd"; | |
length: L | long double | printf( "%Lf", 3.14L ); | 3.140000 |
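The format controls in the table can be combined in a single call. The sketch below is only an illustration: `format_controls` and `format_lengths` are hypothetical helper names, and the values are the ones used in the table examples (apart from `sizeof(char)`, chosen because it is always 1).

```c
#include <stdio.h>

/* Hypothetical helper: applies flag, width, and precision controls to
   fixed values and writes the result into out. */
int format_controls(char *out, size_t size)
{
    return snprintf(out, size,
                    "|%3i|%-3i|%+i|%04i|%-*s|%5.2f|%6.4i|%-5.3s|",
                    12, 12, 17, 12, 5, "ABC", 3.147, 17, "ABCD");
}

/* Hypothetical helper: every argument's type matches its length modifier
   exactly (long, long long, size_t, long double). */
int format_lengths(char *out, size_t size)
{
    return snprintf(out, size, "%ld %lld %zu %Lf",
                    300L, 300LL, sizeof(char), 3.14L);
}
```

Note that the literals carry suffixes (`300L`, `300LL`, `3.14L`) so that the argument types agree with the length modifiers; passing a plain `int` where `%lld` expects a `long long` is undefined behavior.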
scanf
(and its related functions) is more difficult to use
safely and correctly.
We start here with a simple form, but that form is almost never safe to use
in practice!
A call to scanf
looks like this:
scanf( "conversion-string", &variable, ... );
Or, reading from a file using a file handle (stdin
is a predefined
file handle but you can open your own via fopen
), it looks like this:
fscanf( stdin, "conversion-string", &variable, ... );
The conversion-string can contain three types of entries:
Regular characters | This is text that must be matched character by character with the input. Such entries are rarely used for interactive programs, but can be handy when working with formatted input files. (See below for an example.) |
white-space characters | A blank, tab, or other white-space character in the conversion-string will match any amount (including none) of white-space in the input. (So a single space will match any run of white-space, including carriage returns and newlines.) |
Conversion Specifiers | Similar to printf conversion specifiers but just different enough to cause many errors. They all begin with a percent and end with a letter indicating the type of conversion. In between can be some special conversion controls, including the length. Unlike printf, failing to use the exact type and length for the conversion will result in unpredictable errors. Most conversion specifiers will skip any leading white space characters, but there are a couple of exceptions (see below). Since not every compiler checks the conversion-string for argument mismatches, the result can be a runtime (logic) error that is hard to find. These conversion specifiers match a string of characters in the input, convert to the specified type (and length), and store the result in the RAM address provided by the corresponding argument. (The most common error with scanf is not using the address-of operator in front of a variable name for the argument.) |
scanf Return Values
scanf
returns a useful status value.
The return value is an int
which indicates the number of conversions requested
that (1) matched some input, (2) were converted without error, and (3) were assigned without
any problems.
(Matching only, or matching and converting only, doesn't count in the return value.)
Depending on the error encountered the return value may be zero, EOF
(a symbolic
constant usually defined to be -1
), or some other integer
less than the number of requested conversions.
Because so many problems in programs are a result of bad user input, it is common practice
in production-quality code to always check the return value of scanf
.
Here's an example use of fscanf
that attempts to read in two integers from an
input stream called foo
(a FILE * opened earlier with fopen) whose lines are formatted like this:
Height: 12, Width: 34
The C code fragment to read the numbers into variables height
and width
should look something like this:
int height, width;
if ( fscanf( foo, "Height: %i, Width: %i", &height, &width ) != 2 )
{
    fprintf( stderr, "###Error with Scanf: bad input data.\a\n" );
    // Do error processing, maybe just "continue" or "break".
}
Here the fscanf
is requesting two conversions, so if all goes well the return value
should be 2.
Note how the fscanf
uses all three types of entries (regular text, white-space, and
conversion specifiers).
Although text such as "Height:
" and ",
" are matched, they don't count
toward the return value.
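To see how the return value reflects partial matches, the same conversion-string can be applied to a string with sscanf. This is a minimal sketch; the helper name `parse_dims` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number of values assigned (0, 1, or 2),
   exactly as sscanf reports it. */
int parse_dims(const char *line, int *height, int *width)
{
    return sscanf(line, "Height: %i, Width: %i", height, width);
}
```

A line such as "Height: 12, Width: 34" should give 2. A line such as "Height: 12, Depth: 34" should give 1, because the regular text "Width:" fails to match after the first conversion has already been assigned, and "Tall: 12" should give 0.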
The system keeps track of which input has been seen so far.
Every call to scanf
picks up from where the last one stopped matching
input.
This means that if an error occurred with the previous scanf
, the input it
failed to match is still left unread, as if the user typed ahead.
If care isn't taken to discard error input, and a loop is used to read the input,
your program can get caught in an infinite loop.
(See below for an example and further discussion.)
For example, consider this program fragment that reads in an age; assume it is inside a loop:
scanf( "%i", &age );
If the input is "help
" instead of a number, this will cause the scanf
to fail
when attempting to match an integer ("%i
"), and the word help
is left unread.
So the next time through the loop, the scanf
doesn't wait for fresh
user input, it tries to convert help
again.
Similarly, if the input were "17.5
", the "%i
" will match the
first two characters only (the 17
), leaving the .5
as unread
input for the next call to scanf
.
Even if the input is correct, such as "29
", the newline that ended the input
is still left unread.
Normally that isn't a problem since most conversions automatically skip leading
white-space such as the trailing newline from the previous line.
However some conversions ("%c
" and "%[
") don't skip any
leading white-space so you have to do it manually.
Note that all input functions that read from stdin
share the same
input buffer,
so if a call to scanf("%i", &anInt);
is followed by
a call to getchar()
, the newline left unread by scanf is read in now.
This is not usually what is wanted.
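One common way to skip the leftover newline manually is to put a space in front of "%c" in the conversion-string. The sketch below compares the two forms on the same input; the helper name `read_int_and_char` and its `skip_space` parameter are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads an int followed by one character.
   When skip_space is nonzero, the space before %c consumes any
   white-space (including the newline) left after the number. */
int read_int_and_char(const char *in, int skip_space, int *n, char *c)
{
    if (skip_space)
        return sscanf(in, "%i %c", n, c);
    return sscanf(in, "%i%c", n, c);
}
```

Given the input "29\nQ", the first form stores 'Q' in `*c`, while the second stores the newline that followed the number.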
A final warning: Some older compilers will not match any regular text after
the last conversion specifier in the conversion-string.
This bug would prevent the example for "%%
" below from working correctly.
Letter | Type of Matching Argument | Auto-skip Leading White-Space | Example | Sample Matching Input |
---|---|---|---|---|
% | % (a literal, matched but not converted or assigned) | no | int anInt; | 23% |
d | int (See note) | yes | int anInt; long l; | -23 200 |
i | int (See note) | yes | int anInt; | 35 |
b | unsigned int (See note) | yes | unsigned int aUInt; | 1001 |
o | unsigned int (See note) | yes | unsigned int aUInt; | 023 |
u | unsigned int (See note) | yes | unsigned int aUInt; | 23 |
x | unsigned int (See note) | yes | unsigned int aUInt; | 1A |
a, e, f, g | float or double (See note) | yes | float f; double d; | 1.2 3.4 |
c | char (See note) | no | char ch; | Q |
s | array of char (See note) | yes | char s[30]; | hello |
p | void (See note) | yes | int* pi; void* ptr; | 0064FE00 |
n | int (See note) | no | int x, cnt; | X: 123 (cnt==6) |
[ | array of char (See note) | no | char s1[64], s2[64]; | Hello World |
Additional Notes:
The b, o, and x conversions store their result in an unsigned int.
A length modifier of hh, h, l, or ll changes the size of the target type
(for example, %lx expects a pointer to unsigned long), not its signedness.
The input itself may carry an optional sign.
To read a signed value written in octal or hex, the i conversion accepts
input with a leading 0 or 0x.
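The difference between the d and i conversions when the input has a base prefix can be sketched as follows. The helper names `read_any_base` and `read_decimal` are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: %i picks the base from the prefix
   (0x for hex, a leading 0 for octal, otherwise decimal). */
int read_any_base(const char *in, int *value)
{
    return sscanf(in, "%i", value);
}

/* Hypothetical helper: %d always reads decimal digits. */
int read_decimal(const char *in, int *value)
{
    return sscanf(in, "%d", value);
}
```

With the input "017", `read_any_base` stores 15 while `read_decimal` stores 17; with "-0x10", `read_any_base` stores -16.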
scanf Format Specification Syntax
The control of input conversion is much simpler than for output conversions.
Any, all, or none of the following format modifiers may be used between the %
and the final letter of the conversion specification.
Note these must appear (if at all) in the sequence shown here.
A · is used to indicate a space in the example output where spacing is not obvious.
% * maximum-field-width length Letter
Conversion Modifier | Description | Example | Matching Input | Results |
---|---|---|---|---|
* | Assignment Suppression. This modifier causes the corresponding input to be matched and converted, but not assigned (no matching argument is needed). | int anInt; | Age:·29 | anInt==29, |
maximum field-width | This is the maximum number of characters to read from the input. Any remaining input is left unread. Always use this with "%s" and "%[...]" in all production quality code, no exceptions! You should use one less than the size of the array used to hold the result. | int anInt; char s[10]; | 2345 | anInt==23, |
length | This specifies the exact type of the matching argument. These length codes are the same as the printf length modifiers, except as noted below: | | | |
length: l (with a, e, f, or g) | The matching argument is a pointer to double rather than a pointer to float. | double d; | 3.14 | d==3.14 |
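Assignment suppression and the maximum field width can be tried out with sscanf. This is a minimal sketch under the assumption that the input resembles the samples in the table; the helper names `skip_label_read_age` and `read_two_digits_then_rest` are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: %*s matches and discards the label, so only
   the %i conversion is assigned and counted in the return value. */
int skip_label_read_age(const char *in, int *age)
{
    return sscanf(in, "%*s %i", age);
}

/* Hypothetical helper: %2i reads at most two characters; %9s reads at
   most nine more, leaving room for the '\0' in a 10-byte array. */
int read_two_digits_then_rest(const char *in, int *first, char rest[10])
{
    return sscanf(in, "%2i%9s", first, rest);
}
```

For the input "Age: 29" the first helper should return 1 with 29 stored. For "2345" the second should return 2, storing 23 and the string "45".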
scanf and Other Input Techniques
The scanf
call:
int i; float x; char name[50];
scanf( "%2d%f%*d %[0123456789]", &i, &x, name );
With this input:
56789 0123 56a72
will assign to i
the value 56
and to x
the value 789.0
,
will skip 0123
, and assign to name
the sequence 56\0
(the string "56
").
The next character to be read from the input will be a
.
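The walk-through above can be checked with sscanf and a trailing "%n", which records how many characters were consumed. This sketch makes two changes that are not in the example: a width of 49 is added to the "%[" conversion to protect the 50-byte array, and the helper name `run_example` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: runs the conversion-string from the example on a
   string and reports in *consumed where matching stopped. */
int run_example(const char *in, int *i, float *x, char name[50], int *consumed)
{
    *consumed = -1;  /* stays -1 if matching fails before %n is reached */
    return sscanf(in, "%2d%f%*d %49[0123456789]%n", i, x, name, consumed);
}
```

For the input "56789 0123 56a72" the return value should be 3 (neither the suppressed "%*d" nor "%n" is counted), and `in[*consumed]` should be the character a.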
A simple (and common) example reads an int
from a user this way:
int age;
for ( ;; )
{
    fprintf( stderr, "Please enter your age: " );
    if ( scanf( "%i", &age ) == 1 )
        break;
    // Do some sort of error processing:
    fprintf( stderr, "\nError reading in the age, please try again.\n" );
}
printf( "You are %i years old.\n", age );
Note the use of fprintf
to send the output to the screen even if output was
redirected.
Some common error processing is to reset the variables and try again, using a loop
as shown here.
Sometimes a count of attempts is kept and the user is only given a certain number of
attempts before the program gives up.
A problem with scanf
is that it leaves any unmatched input unread.
This may be a problem for applications that expect line oriented input.
When each line (or record) of input is to be processed independently, an
error such as bad data on one line can cause errors when attempting to read the
following line.
Consider the code above to read in an age.
If the input entered by the user was non-numeric such as the word "help
",
the "h
" would not match the "%i
" and would be left unread.
When the for
loop repeated, the scanf
would encounter the "h
"
again and immediately fail.
This would cause an infinite loop!
A similar problem would exist if the user entered "29.5
" for their age.
The first time through the loop the scanf
would read the 29
.
If the next input expected was a person's name or ID or whatever, the ".5
"
will be read next.
Another common problem with this approach is mixing scanf
with getchar
or getc
.
The scanf
typically leaves the newline unread, so a call to read the
next character retrieves that instead of the character the programmer expected to get.
(This problem may be worse on DOS based systems, which have two characters to mark
the end of lines.)
The solution is to use fgets
(see note)
to read input a line at a time into a buffer,
then use the sscanf
function to parse the contents of the buffer.
With this approach each input operation consumes (reads) an entire line of input,
even if it had errors.
The next input operation starts fresh with the next line of input.
Here's an example to illustrate the technique:
char buf[BUFSIZ];  /* Buffer for a line of input. */
int age;
fprintf( stderr, "Please enter your age: " );
// Loop until the user enters a correct value:
while ( fgets( buf, sizeof(buf), stdin ) != NULL )
{
    if ( sscanf( buf, "%i", &age ) == 1 )
        break;
    // Do some sort of error processing:
    fprintf( stderr, "\nError reading in the age, please try again.\n" );
    fprintf( stderr, "Please enter your age: " );
}
// age has been correctly read at this point (unless fgets hit EOF first).
The call to fgets
reads all input up to and including a newline.
It then copies that line of input into buf
, adding a '\0'
at the
end to form a valid C string.
The terminating newline is also copied into buf
.
On EOF
, fgets
returns NULL
.
(EOF
and NULL
are defined in <stdio.h>
.)
If the input is larger than the size of the buffer, then only the input that will
fit is consumed (read).
Note fgets
is smart enough to reserve space for the '\0'
from
the size given.
In this case the maximum input read would be BUFSIZ-1
.
The sscanf
works just like scanf
or fscanf
.
The first argument to sscanf
is the string to read from
(instead of stdin
as for scanf
).
If the fgets
doesn't detect EOF
but the sscanf
fails to match any input using "%s"
, the input must have been a blank
line.
When using "%i"
and not a "%s"
,
the return value doesn't tell if the input was a blank line or some other error.
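The line-at-a-time technique can be wrapped in a function that takes the input stream as a parameter, which also makes it easy to test against a file. This is a minimal sketch; the name `read_age` is hypothetical, and lines longer than BUFSIZ-1 characters are not handled specially.

```c
#include <stdio.h>

/* Hypothetical helper: reads lines from in until one starts with an
   integer. Returns 1 and stores the value on success, or 0 if the
   input ends first. */
int read_age(FILE *in, int *age)
{
    char buf[BUFSIZ];

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (sscanf(buf, "%i", age) == 1)
            return 1;
        /* The bad line was fully consumed by fgets, so the next
           iteration starts fresh with the following line. */
    }
    return 0;
}
```

Called as `read_age(stdin, &age)`, a line containing "help" is simply discarded and the next line is tried, so the infinite loop described earlier cannot occur.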
The line at a time example above works well
but doesn't detect all errors.
Consider what would happen if the user entered 29.5
for an age, or
32,500
for some numeric value (such as a person's income in dollars).
While the fgets
will read the whole line, the sscanf
will
only read 29
in the first case and 32
in the second.
In both cases sscanf
will return 1
and the errors would
go undetected.
In order to detect such extra input on the line, sscanf
must
attempt to match it, convert it, and assign it to a variable.
Then the return value will be 2
if extra input was present.
There are two ways to do this.
If extra white space is not considered an error you can use "%s
"
instead of the "%[
" used below:
char buf[BUFSIZ], junk[BUFSIZ];
int income;
fprintf( stderr, "Please enter your income: " );
// Loop until the user enters a correct value:
while ( fgets( buf, sizeof(buf), stdin ) != NULL )
{
    if ( sscanf( buf, "%i%[^\n]", &income, junk ) == 1 )
        break;
    // Do some sort of error processing:
    fprintf( stderr, "\nError reading your income, please try again.\n" );
    fprintf( stderr, "Please enter your income: " );
}
// income has been correctly read at this point (unless fgets hit EOF first).
Here the sscanf
will skip leading white space (%i
does
this automatically), match digits until the first non-digit, convert the
matched string to
an int
and assign the result to income
.
sscanf
then matches any remaining
characters (up to but not including a newline) and stores the string
in junk
.
If there were no input errors, the %i
would succeed but the
%[^\n]
will fail to match any input.
The return value would therefore be 1
.
However if any extra input was encountered the %[^\n]
will match it
and assign the string to junk
.
This would cause sscanf
to return 2
.
If the user input was "$32500
", the %i
would fail to
match anything and the return value would be 0
.
This technique will therefore catch any input errors and
consume the entire line (record) whether or not errors were present.
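The check for extra input can be isolated in a small function. The sketch below is a variation on the technique above, not the original code: it limits the junk conversion to a single character ("%1[^\n]"), since one stray character is enough to reject the line. The name `parse_whole_int` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 only if line holds an integer with
   nothing after it except the newline, otherwise 0. */
int parse_whole_int(const char *line, int *value)
{
    char junk[2];  /* one stray character plus the '\0' */

    /* A result of exactly 1 means %i matched and %1[^\n] found nothing. */
    return sscanf(line, "%i%1[^\n]", value, junk) == 1;
}
```

With this check "32500\n" is accepted, while "32,500\n", "29.5\n", and "$32500\n" are all rejected. Trailing spaces are also rejected, matching the behavior described for "%[".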
Below is a table of character constants.
These can be used individually (as a char
literal by surrounding
the constant with single quotes) or as part of a double-quoted string literal.
Constant | Meaning |
---|---|
\' | A single quote |
\" | A double quote |
\? | A question mark |
\\ | A backslash |
\a | Alert sound |
\b | A backspace |
\f | A form-feed |
\n | A newline |
\r | A carriage-return |
\t | A tab |
\v | A vertical tab |
\ooo | Octal constant (up to three octal digits) |
\xHH | Hexadecimal constant (one or more hex digits) |
\uHHHH | Unicode constant (four hex digits) |
\UHHHHHHHH | Long Unicode constant (eight hex digits) |
The C standard doesn't use the term Unicode character much; the authors prefer
the term universal character name.
It's pretty clear that they mean Unicode though, since it is referred to by its
official name of "ISO/IEC 10646" in several places.
Unicode code points need up to 21 bits, so a fixed-width encoding takes 4 bytes per character;
however, most characters in common use fall within the first 65,536 code points, so 2 byte forms are common.
(Actually Unicode is often stored in files in a form called UTF-8.)
Wide characters have the type wchar_t.
You can make a literal for one as follows: L'\u00A9'
(which, in Latin-1, is the same character as '\xA9', namely the "©" symbol).
You can also form wide strings such as: L"\u00A92001",
which translates to the string "©2001".
(Technically, regular strings are multi-byte character strings as they
can often hold multi-byte characters.
It depends on the locale, the compiler,
the platform (Windows allows only 1 or 2 byte characters),
and the specific character in question.)
There's a whole host of string and character conversion functions such as
mbstowcs
(multi-byte string to wide string) and
wcstombs.
To deal properly with all Unicode input or output, I generally use
the ICU
Unicode library.
This information was extracted from ISO/IEC 9899, second edition (the C99 Standard), mostly from sections 7.19.6.1 and 7.19.6.2. Some information comes from a draft of the C23 standard.