How to Find Duplicates in a String in C Language with Examples
In this article, I will discuss how to find duplicates in a string in C with examples. Please read our previous article, which discussed how to compare strings and check for palindromes in C. By the end of this article, you will understand the following three approaches to finding duplicates in a string:
- Finding Duplicates in a String by Comparing with Other Letters in C
- Finding Duplicates by Using a Hash Table or Counting in C
- Find Duplicates in a String using Bitwise Operations in C
Finding Duplicates in a String:
Finding duplicates in a string means determining whether any character occurs more than once. We will look at how to find duplicate characters, that is, letters of the alphabet or other characters that repeat within a string. For example, if the letter i appears more than once in a string, then i is a duplicate in that string. We can use the following three methods to find duplicate characters in a string:
- Compare With Other Letters: To find duplicates in a string by comparing each letter with the others in C, you will need to use two nested loops. The outer loop will iterate through each character in the string, and the inner loop will compare that character with every other character in the string.
- Using HashTable or Counting: Using a hash table or a counting array in C to find duplicates in a string is an efficient method, especially for character sets like ASCII, where the total number of possible characters is known and manageable.
- Using Bits: To find duplicates in a string using bitwise operations in C, you can utilize bit manipulation techniques. This method is particularly efficient when dealing with strings containing only lowercase alphabetic characters (‘a’ to ‘z’).
Approach 1: Finding Duplicates in a String by Comparing with Other Letters in C
To find duplicates by comparing each letter with the others, use two nested loops: the outer loop iterates through each character in the string, and the inner loop compares that character with the characters that follow it. Here’s a step-by-step approach and an example:
- Outer Loop: Iterate through each character in the string using an index variable, say i.
- Inner Loop: For each character at index i, iterate through the rest of the string with another index variable, say j, starting from i + 1.
- Comparison: Compare the character at index i with each character at index j. If they match, you’ve found a duplicate.
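- Avoid Reprinting Duplicates: Break out of the inner loop once a match is found, so that a single position in the string is never reported more than once. Note that a character occurring three or more times is still reported once for each occurrence that has a later match.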
Here’s a simple C program demonstrating this approach:
In the following example, the function findDuplicates checks for duplicate characters in the string “programming”. When it finds a duplicate, it prints that character and then checks the rest of the string.
#include <stdio.h>
#include <string.h>

void findDuplicates(const char *str)
{
    size_t len = strlen(str);

    // Iterate over each character
    for (size_t i = 0; i < len; i++) {
        int foundDuplicate = 0;

        // Compare with the rest of the characters
        for (size_t j = i + 1; j < len; j++) {
            if (str[i] == str[j]) {
                foundDuplicate = 1;
                break; // Stop searching once a duplicate is found
            }
        }

        if (foundDuplicate) {
            printf("Duplicate character found: %c\n", str[i]);
        }
    }
}

int main(void)
{
    char str[] = "programming";
    printf("String is \"%s\"\n", str);
    findDuplicates(str);
    return 0;
}
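Output:
String is "programming"
Duplicate character found: r
Duplicate character found: g
Duplicate character found: m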
This method is straightforward but not the most efficient, especially for long strings, as it has an O(n²) time complexity. For more efficient methods, you might consider sorting the string first or using a hash table to track character occurrences.
Approach 2: Finding Duplicates by Using a Hash Table or Counting in C
To find duplicates in a string using a hash table or a counting array in C, you need to follow this approach:
- Initialize a Counting Array: This array will serve as a simple hash table where each index corresponds to a character. The value at each index will represent the number of times that character has appeared in the string. For plain ASCII characters, 128 entries would be enough; an array of size 256 covers every value an 8-bit unsigned char can hold.
- Iterate Over the String: Go through each character in the string and use its ASCII value as the index in the counting array. Increment the value at this index for each occurrence of the character.
- Identify Duplicates: After processing the entire string, iterate through the counting array. Any index with a value greater than 1 indicates that the corresponding character is a duplicate.
Here’s an example of a C program implementing this approach:
#include <stdio.h>

#define MAX_CHAR 256 // Number of distinct unsigned char values (includes all of ASCII)

void findDuplicates(const char *str)
{
    int count[MAX_CHAR] = {0}; // Initialize all elements to 0

    // Increment count for each character
    for (int i = 0; str[i] != '\0'; i++) {
        count[(unsigned char)str[i]]++;
    }

    // Check for duplicates
    printf("Duplicate characters in the string:\n");
    for (int i = 0; i < MAX_CHAR; i++) {
        if (count[i] > 1) {
            printf("'%c' appears %d times\n", i, count[i]);
        }
    }
}

int main(void)
{
    char str[] = "Programming";
    printf("String is \"%s\"\n", str);
    findDuplicates(str);
    return 0;
}
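Output:
String is "Programming"
Duplicate characters in the string:
'g' appears 2 times
'm' appears 2 times
'r' appears 2 times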
In this program:
- MAX_CHAR is defined as 256, which covers every value an 8-bit unsigned char can hold and therefore includes the full ASCII range (0 to 127).
- The findDuplicates function creates an array count of size MAX_CHAR. Each index of the array corresponds to the numeric code of a character.
- The function iterates over the input string, incrementing the count of each character.
- It then iterates over the count array and prints out characters that appear more than once, indicating duplicates.
This method is efficient in terms of time complexity, taking O(n) time for a string of length n, but it uses extra space for the counting array. It’s ideal for strings where the character set is known and limited, like ASCII characters. A different approach may be needed if the character set is Unicode or another extensive set, because a direct-indexed array would then require too much memory.
Approach 3: Find Duplicates in a String using Bitwise Operations in C
Bit manipulation lets a single integer record which characters have already been seen. This method is particularly efficient when dealing with strings containing only lowercase alphabetic characters (‘a’ to ‘z’). Here’s how you can implement it:
- Create a Checker Variable: This variable will use individual bits to represent whether a character has been seen before. Each bit corresponds to one character of the alphabet, with the least significant bit representing a, the next bit representing b, and so on.
- Iterate Through the String: For each character, calculate the corresponding bit’s position using bit shifting.
- Check for Duplicates: Use bitwise AND to check if the bit at that position is already set. If it is, the character is a duplicate.
- Mark the Character as Seen: Use bitwise OR to set the bit corresponding to that character in the checker variable.
Here is a C program to demonstrate this approach:
#include <stdio.h>

void findDuplicates(const char *str)
{
    unsigned int checker = 0; // Bit k is set once the letter 'a' + k has been seen

    for (int i = 0; str[i] != '\0'; i++) {
        // Skip anything outside 'a' to 'z'; shifting by such a value would be undefined
        if (str[i] < 'a' || str[i] > 'z') {
            continue;
        }

        int val = str[i] - 'a'; // Calculate the position of the character

        // Check if the bit at 'val' position is already set
        if ((checker & (1u << val)) != 0) {
            printf("Duplicate character found: %c\n", str[i]);
        } else {
            // Set the bit for this character in checker
            checker |= (1u << val);
        }
    }
}

int main(void)
{
    char str[] = "programming";
    printf("String is \"%s\"\n", str);
    findDuplicates(str);
    return 0;
}
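Output:
String is "programming"
Duplicate character found: r
Duplicate character found: m
Duplicate character found: g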
Important Notes:
- This method only works for strings containing lowercase letters ‘a’ to ‘z’. It’s not suitable for uppercase letters, numbers, special characters, or extended character sets.
- The space complexity is very low, as it uses a single integer variable (checker) for tracking.
- The function assumes the input string is null-terminated, as per C standards.
- If the input string can contain characters outside ‘a’ to ‘z’, this method needs to be modified, or a different approach should be used.
In the next article, I will discuss How to Find Duplicates in a String using Bitwise Operations in C Language with Examples. In this article, I have tried to explain How to Find Duplicates in a String in C Language with Examples, and I hope you enjoyed it. I would like to have your feedback. Please post your feedback, questions, or comments about this article.