Strings in C Language with Examples:
In this article, I will discuss the Strings in C Language with Examples. Please read our previous section discussing Arrays in C Language with Examples. As part of this article, you will learn the following pointers in detail with examples.
- Character Set / ASCII Codes
- Character Array
- Creating a String
- What are Strings in C?
- Why do we need strings?
- Declaration and Initialization of String
- Memory Representation of String
- Multiple Examples to Understand String in C.
- What do you mean by Formatted and Unformatted Functions?
Character Set / ASCII Codes in C Language:
In C, characters are stored as small integer codes. The C standard does not mandate a particular encoding, but virtually every modern system uses ASCII (American Standard Code for Information Interchange) or a superset of it, and this article assumes ASCII throughout. ASCII is a character encoding standard that assigns a numerical value to each letter, digit, and symbol. Here’s a brief overview of how character sets and ASCII codes are used in C:
- Character Data Type: The char data type stores a single character in C. By definition, sizeof(char) is 1 byte, and a byte is 8 bits on virtually all current platforms.
- ASCII Values: Each character in the ASCII character set is represented by a unique integer value. For example, the ASCII value for ‘A’ is 65, and for ‘a’ is 97. These values are integral to how characters are processed in C.
- Character Literals: In C, character literals are enclosed in single quotes, e.g., ‘A’, ‘b’, ‘1’, etc. The value of a literal is its corresponding ASCII code (in C, a character literal has type int).
- Escape Sequences: C also supports several escape sequences for non-printable characters, like newline (\n), tab (\t), backspace (\b), etc. These are also part of the ASCII character set.
- Using ASCII Values in C: You can perform arithmetic operations with characters in C because they are essentially stored as their ASCII values. For example, ‘A’ + 1 will result in 66, which is the ASCII value of ‘B’.
- String Representation: In C, strings are represented as arrays of characters, terminated by a null character (‘\0’), which is also part of the ASCII set.
- Extended ASCII Set: Some systems use the extended ASCII set, which includes additional characters and symbols and uses 8 bits for each character, allowing for 256 different symbols or characters.
Every character has a numeric code. In ASCII, the uppercase letters run from 65 (A) to 90 (Z), the lowercase letters from 97 (a) to 122 (z), and the digit characters from 48 (‘0’) to 57 (‘9’).
How a Character is Represented in C Language?
Now, let us understand how a character is represented and what a character array is. A character variable is declared the same way in C and C++, for example: char temp;
Here, char is the data type, and temp is the name of the variable. It occupies one byte of memory. To store a value, we can initialize it with a character, for example: char temp = 'A'; Because the data type is char, the value must be written in single quotes, and it must be a single character.
Now, what is actually stored in memory? The value 65 is stored, not the letter A. Memory holds only numbers, and 65 is the ASCII code for A. To print the character, we write: printf("%c", temp);
The stored value is 65, but because the format specifier is %c, printf displays the character A on the screen. If we change the specifier to %d, it displays 65 instead.
Character Array in C Language:
In C programming, a character array is a sequence of characters stored in contiguous memory locations. This concept is fundamental for handling strings in C because the language doesn’t have a native string type.
Now, let us see how to create an array of characters. A character array is defined like any other array in C, but with char as its element type. As with any other array declaration, we must provide the array name and size. The following declaration creates a character array named B of size 5. It is a declaration only, with no initialization: char B[5];
While creating the array, we can also initialize it. This is declaration plus initialization, for example: char B[5] = {'A', 'B', 'C', 'D', 'E'};
Specifying the array size is optional if we initialize the array while creating it. In the following declaration, no size is given, so the compiler counts the initializers: char B[] = {'A', 'B', 'C', 'D', 'E'};
Here, the array is created with size 5 and initialized with these five characters. The size is taken from the number of characters we assign. So, we can create an array with or without mentioning the size. Now, look at the following example, where the array has size 5 but only two characters are given: char B[5] = {'A', 'B'};
The array now holds A and B in its first two elements. Because the array is partially initialized, the remaining three elements do not contain garbage values: C sets them to zero, which is the null character (‘\0’).
Strings in C Language:
In C, strings are represented as null-terminated character arrays. The end of the string is marked by a special character, the null character (‘\0’), which has an ASCII value of zero. Suppose we want to store a name. We create an array of characters named boy with size 10 and store the name Rohan in it: char boy[10] = {'R', 'o', 'h', 'a', 'n'};
In C, a string is a sequence of characters, so the name “Rohan” is a string. In memory, the five letters occupy the first five elements of the array, and the remaining five elements are not used by the name.
Here, the size of the array is 10, but the string is only 5 characters long. Then how do we know where the string ends?
In C and C++, the end of a string is marked by the null character, ‘\0’. It is also called the string delimiter or the string terminator. Other languages, such as Java and C#, do not rely on a terminating ‘\0’; they keep track of the length of the string instead.
Then how do we know how many characters are valid? In C and C++, the length of a string is found by scanning for the null character, so strings are delimited by ‘\0’.
A character array becomes a string only when it contains a terminating ‘\0’. In the declaration above, the elements after the fifth happen to be zero because the array was only partially initialized, but it is clearer to write the terminator explicitly: char boy[10] = {'R', 'o', 'h', 'a', 'n', '\0'};
Now, this character array is clearly a string. An array with no ‘\0’, for example one declared with size 5 that holds exactly five letters, is just an array of characters and must not be passed to functions that expect a string. This is the difference between an array of characters and a string.
How Many Ways We Can Create Strings in C?
Now, let’s see the ways of declaring and initializing a string. In the 1st method, we specify the size as well as the characters of the string: char boy[10] = {'R', 'o', 'h', 'a', 'n', '\0'};
In the 2nd method, we declare the string without any size and let the initializer determine it: char boy[] = {'R', 'o', 'h', 'a', 'n', '\0'};
Then, what will be the size of this array? The size is 6: five elements for the letters of the name plus one for ‘\0’, which also consumes memory. In the 3rd method, we write the name in double quotes: char boy[] = "Rohan"; Because the name is written in double quotes, ‘\0’ is included automatically. This method is more convenient than the previous two.
We can also create a string using a character pointer: char *y = "Rohan";
Here, y is a character pointer. Then where is the string stored? The string literal is not created in heap memory; nothing is placed on the heap unless we call a function such as malloc(). A string literal has static storage duration: the compiler places its characters, followed by ‘\0’, in a static area of the program (often read-only) that exists for the whole run. The pointer variable y itself is a local variable created on the stack, and it holds the address of the first character of the literal.
The other array, i.e., boy, is created on the stack and holds its own copy of the characters, so we can modify it freely. The literal that y points to must not be modified. Attempting to do so is undefined behavior, which is why such pointers are best declared as const char *.
Note: Memory allocated on the heap with malloc() has no name of its own, so it can only be accessed through a pointer. Stack variables, in contrast, can be accessed directly by name.
How to Access a String in C?
Now, let us understand how to access a string in C, starting with printing it. Let us assume we have the following string: char boy[] = "Rohan"; You can also think of it as a character array. In this article, a string is simply a character array that ends with ‘\0’.
To print the above string, we use the printf function as follows: printf("%s", boy);
Here, %s is the format specifier for a string. We pass the name of the array, and the string is displayed. Remember, %s works only with null-terminated character arrays, not with arrays of other types such as int or float.
Example: Printing the String
```c
#include <stdio.h>

int main()
{
    // declare and initialize string
    char boy[] = "Rohan";

    // print string
    printf("%s", boy);
    return 0;
}
```
Output: Rohan
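Now, if you want to read a new name from the console, you can use the scanf function as follows: scanf("%s", boy);
The scanf function reads characters from the keyboard and stores them in the array, followed by ‘\0’. Note that %s stops reading at the first whitespace character, so it reads a single word. A plain %s also does not limit how many characters are stored, so it is safer to give a maximum field width, such as %9s for an array of 10 elements, to avoid writing past the end of the array.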
Properties of Strings in C Language:
- In the declaration of a string, the size must be an integer constant whose value is greater than zero.
- In the initialization of a string, if fewer characters are given than the size of the array, the remaining elements are automatically initialized with null (‘\0’).
- In the initialization of a string, it is not possible to give more initializers than the size of the array.
- In the initialization of a string, if we assign a numeric value, the character with that ASCII code is stored. For example, 65 stores ‘A’.
- In the initialization of a string, specifying the size is optional. In this case, the number of characters initialized decides the size of the array.
- When working with strings, it is recommended to initialize the data in double quotes, so that ‘\0’ is added automatically.
- A string constant always ends with a ‘\0’ (null) character, which is why one extra byte of memory is required. A character array that is not used as a string does not need this extra byte.
- When working with character operations, it is recommended to go for the %c format specifier.
- When working with string operations, it is recommended to go for the %s format specifier.
- When working with the %s format specifier, we must pass the address of a string. Everything from the given address up to the null character is printed on the console.
- When a null character occurs in the middle of a string, the data after it is not printed, because the null character indicates the termination of the string.
What do you mean by Formatted and Unformatted Functions in C?
In C programming, input and output functions are often categorized into formatted and unformatted functions. This categorization is based on whether the functions allow for format specification (formatting the input/output) or not. Understanding the distinction between these two types of functions is essential for effective I/O operations in C.
Formatted Functions in C:
Formatted functions allow us to specify a format string that dictates how the input or output should be processed. This format string includes format specifiers, which define the expected type and format of the data. In short, functions that work with the help of format specifiers are called formatted functions. A formatted function can be applied to any data type. For example: printf(), scanf(), fprintf(), fscanf(), sprintf(), etc.
Examples:
- Output: printf is the most common formatted output function. You can use format specifiers like %d for integers, %f for floating-point numbers, %s for strings, etc.
- Input: scanf is a common formatted input function. It uses format specifiers similar to printf to determine the type and format of the input data to be read.
Advantages: Formatted functions provide greater input/output processing control, allowing for more precise and customized I/O operations.
Unformatted Functions in C:
Unformatted functions perform input/output operations without any formatting. They deal with the data as-is, without interpreting or converting it according to any format specifiers. In short, functions that do not require a format specifier and work only with specific data types are called unformatted functions. For example: puts(), gets(), fputs(), cgets(), getch(), etc. Of these, cgets() and getch() are non-standard functions from conio.h, and gets() is no longer part of standard C.
Examples:
- Output: Functions like putchar (which outputs a single character) and puts (which outputs a string) are unformatted output functions.
- Input: Functions like getchar (which reads a single character) and gets (which reads a string until a newline is encountered) are unformatted input functions.
Advantages: Unformatted functions are typically simpler and faster than formatted functions, as they involve less processing overhead.
Key Differences Between Formatted and Unformatted Functions in C:
- Complexity and Control: Formatted functions are more complex but offer more control over how data is read/written. Unformatted functions are simpler but offer less control.
- Performance: Unformatted functions can be faster due to their simpler nature and lack of processing for format specifiers.
- Use Cases: Formatted functions are essential when handling data in a specific format, while unformatted functions are suitable for straightforward, raw data handling.
Note: We have already discussed the example of formatted functions in C. Let us proceed and see the example of unformatted functions in C.
Unformatted puts() Functions in C:
It is a predefined unformatted function declared in stdio.h. The puts() function in C is a simple and commonly used unformatted output function. It’s used to write a string to the standard output (stdout), typically the console. The puts() function automatically appends a newline character (\n) at the end of the string, which is one of its distinguishing features compared to other output functions like printf(). For a better understanding, please look at the following example:
```c
#include <stdio.h>

int main()
{
    // Define a string
    char myString[] = "Hello, World!";

    // Using puts() to display the string
    puts(myString);
    return 0;
}
```
In this example:
- We include the standard I/O library <stdio.h>, which contains the declaration of puts().
- We define a character array myString with “Hello, World!”.
- We call puts(myString) to print the string to the standard output.
When this program is run, it will output:
Hello, World!
followed by a newline, which means the cursor will move to the beginning of the next line in the console.
Key Points About puts()
- Simplicity: puts() is simpler to use than printf() for just displaying a string, as you don’t need to worry about format specifiers.
- Automatic Newline: puts() automatically adds a newline at the end of the output, which can be convenient but might not always be desired.
- Return Value: puts() returns a non-negative number on success and EOF (end-of-file) on an error. This can be used for error checking if needed.
- Unformatted: puts() does not allow for formatting the string, unlike printf(). It outputs the string exactly as it is.
Unformatted gets() Functions in C:
It is a predefined unformatted function that was declared in stdio.h. The gets() function in C is an unformatted input function that reads a line from standard input (stdin), typically the keyboard. However, gets() is unsafe: it has no way of knowing the size of the destination array, so it performs no bounds checking, and input longer than the array causes a buffer overflow. Because this vulnerability is a serious security risk, gets() was deprecated and then removed from the language in the C11 standard.
Instead of gets(), it’s highly recommended to use fgets(), a safer alternative, as it allows us to specify the maximum number of characters to be read, thus preventing buffer overflows. Here’s an example of how you might use fgets():
```c
#include <stdio.h>

int main()
{
    char myString[100];

    printf("Enter a string: ");

    // Using fgets() to read up to 99 characters;
    // it returns NULL on error or end-of-file
    if (fgets(myString, sizeof(myString), stdin) != NULL)
    {
        printf("You entered: %s", myString);
    }
    return 0;
}
```
In this example:
- A character array myString is defined with a size of 100 characters.
- fgets(myString, sizeof(myString), stdin) reads a line from standard input. It reads up to 99 characters (one less than the size of the array to leave room for the null terminator). It stops reading if a newline character is encountered, which it includes in the array.
- The entered string is then printed using printf().
Key Points About fgets()
- Safety: fgets() is safer than gets() because it allows you to specify the maximum number of characters to read.
- Newline Character: If the input is shorter than the specified length, fgets() includes the newline character in the string.
- Null Termination: fgets() null-terminates the string.
- Return Value: fgets() returns NULL on an error or when end-of-file is encountered without any characters being read.
Unformatted getch() Functions Example in C
The getch() function in C is an unformatted input function commonly used to read a single character from the keyboard without echoing it to the console. This function is not part of the C standard library but is available in some environments, particularly in the conio.h (console input/output) header file used in some DOS and Windows compilers.
getch() is typically used when you want to capture user input immediately without waiting for the user to press Enter and without displaying the typed character on the screen. It’s often used for creating command-line interfaces where immediate response to key presses is needed. Here’s a simple example of how getch() might be used:
```c
#include <conio.h>
#include <stdio.h>

int main()
{
    char ch;

    printf("Press a key: ");
    ch = getch(); // Read a character without echoing
    printf("\nYou pressed: %c\n", ch);
    return 0;
}
```
In this example:
- We include conio.h to use getch().
- We read a single character using getch() and store it in the variable ch.
- The character is then printed using printf().
Key Points About getch()
- No Echo: getch() reads a character but does not echo it back to the console.
- Immediate Input: It does not wait for the Enter key to be pressed, making it useful for interactive, immediate-response applications.
- Portability: Since getch() is not part of the standard C library and is primarily found in conio.h, its availability and behavior might vary across different compilers and platforms. It’s typically used in DOS/Windows environments and may not be available on Unix/Linux systems.
- Alternative in Unix/Linux: For similar functionality in Unix/Linux, you might need to use system-specific calls (like ncurses library functions).
In this article, I explained Strings in C Language with Examples. In the next article, I will discuss String Predefined Functions in C Language with Examples.