Replace all occurrences of specific words in a sentence based on an array of words in JavaScript
We are required to write a JavaScript function that takes a string and an array of strings.
Our function should return a new string in which every word of the input string that also appears in the array is replaced by whitespace.
Our function should use the String.prototype.replace() method to solve this problem.
Understanding the Problem
When filtering words from a sentence, we need to:
- Match whole words only (not parts of words)
- Handle case-insensitive matching
- Use regular expressions for efficient replacement
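To make the first two requirements concrete, here is a minimal sketch comparing a pattern without word boundaries to one with them. The sample text and the variable names (`text`, `partial`, `whole`) are made up for illustration and are not part of the original problem.

```javascript
const text = "There is a theme in the theory";

// Without word boundaries, "the" also matches inside longer words
// such as "There", "theme" and "theory".
const partial = text.replace(/the/gi, "_");

// With \b on both sides, only the standalone word "the" matches.
const whole = text.replace(/\bthe\b/gi, "_");

console.log(partial); // _re is a _me in _ _ory
console.log(whole);   // There is a theme in _ theory
```

The `i` flag is what lets the lowercase pattern match the capitalised "There" in the first call.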
Example Implementation
Here's a complete solution using String.prototype.replace() with regular expressions:
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM",
"AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO",
"DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS",
"IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT",
"THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS",
"WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"];
var sentence = "The first solution does not work for any UTF-8 alphaben. I have managed to create function which do not use RegExp and use good UTF-8 support in JavaScript engine. The idea is simple if symbol is equal in uppercase and lowercase it is special character. The only exception is made for whitespace.";
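const removeExcludedWords = (str, words) => {
  // Create regex pattern with word boundaries for exact word matching.
  // The backslash is doubled because `\b` inside a template literal is a
  // backspace character, not the regex word-boundary assertion.
  const regex = new RegExp(`\\b(${words.join('|')})\\b`, 'gi');
  // Replace matched words with an empty string
  return str.replace(regex, "");
};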
console.log(removeExcludedWords(sentence, excludeWords));
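Output (each removed word leaves its neighbouring spaces behind, so the actual string has a leading space and runs of spaces; they are collapsed here for readability):
first solution does work UTF-8 alphaben. I have managed create function use RegExp use good UTF-8 support JavaScript engine. idea simple if symbol equal uppercase lowercase special character. only exception made whitespace.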
How It Works
The solution uses several key components:
- \b - Word boundaries ensure we match complete words only
- words.join('|') - Creates an alternation pattern (word1|word2|word3)
- 'gi' flags - Global and case-insensitive matching
- replace(regex, "") - Replaces all matches with an empty string
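It can help to inspect the pattern that these components produce. The sketch below builds the pattern string from a short, made-up word list and prints it; the names `words`, `source` and `regex` are illustrative only. It also shows one detail that is easy to miss when the pattern is written as a string: a single `\b` in a string or template literal is the backspace character, so the backslash has to be doubled to reach the regex engine as a word boundary.

```javascript
const words = ["the", "over", "and"];

// Doubled backslashes so the regex engine receives \b (word boundary).
const source = `\\b(${words.join('|')})\\b`;
const regex = new RegExp(source, 'gi');

console.log(source);       // \b(the|over|and)\b
console.log(regex.flags);  // gi

// A single backslash-b in a template literal is one character (backspace),
// while the doubled form is two characters: a backslash followed by "b".
console.log(`\b`.length);  // 1
console.log(`\\b`.length); // 2
```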
Alternative Approach with Space Replacement
If you prefer to replace excluded words with spaces instead of removing them entirely:
const removeExcludedWordsWithSpaces = (str, words) => {
  // Backslashes are doubled: `\b` in a template literal would be a backspace
  const regex = new RegExp(`\\b(${words.join('|')})\\b`, 'gi');
return str.replace(regex, " ").replace(/\s+/g, " ").trim();
};
var testSentence = "The quick brown fox jumps over the lazy dog";
var commonWords = ["the", "over"];
console.log("Original:", testSentence);
console.log("Filtered:", removeExcludedWordsWithSpaces(testSentence, commonWords));
Original: The quick brown fox jumps over the lazy dog
Filtered: quick brown fox jumps lazy dog
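The space-replacement variant chains three steps, and splitting them apart shows what each one contributes. This is only a sketch with the pattern written as a regex literal for the two words used above; the intermediate names `raw`, `collapsed` and `trimmed` are introduced here for illustration.

```javascript
const input = "The quick brown fox jumps over the lazy dog";

// Step 1: replace each excluded word with a single space.
const raw = input.replace(/\b(the|over)\b/gi, " ");

// Step 2: collapse every run of whitespace into one space.
const collapsed = raw.replace(/\s+/g, " ");

// Step 3: drop the leading/trailing space left by words at the edges.
const trimmed = collapsed.trim();

// JSON.stringify makes the leading and repeated spaces visible.
console.log(JSON.stringify(raw));
console.log(JSON.stringify(collapsed)); // " quick brown fox jumps lazy dog"
console.log(JSON.stringify(trimmed));   // "quick brown fox jumps lazy dog"
```

Without steps 2 and 3 the result keeps a leading space and a gap of several spaces where "over the" used to be.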
Key Points
- Word boundaries (\b) prevent partial word matches
- Case-insensitive flag ('i') handles different capitalizations
- Global flag ('g') replaces all occurrences, not just the first
- The pipe operator (|) creates an OR pattern in regex
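Because the words are joined straight into the pattern, any word containing a regex metacharacter (such as `.` or `|`) changes what the pattern matches. A possible safeguard, not part of the original solution, is to escape each word first. The helper name `escapeRegExp`, the word list and the sample sentence below are all assumptions made for this sketch.

```javascript
// Hypothetical helper: escapes characters that have special meaning in a regex.
const escapeRegExp = (word) => word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

const words = ["a.m", "over"];
const text = "At 9 a.m the aXm sign fell over";

// Unescaped: the "." in "a.m" matches any character, so "aXm" is hit too.
const unescaped = new RegExp(`\\b(${words.join("|")})\\b`, "gi");

// Escaped: "." is matched literally, so only "a.m" and "over" are hit.
const escaped = new RegExp(`\\b(${words.map(escapeRegExp).join("|")})\\b`, "gi");

const loose = text.replace(unescaped, "_");
const strict = text.replace(escaped, "_");

console.log(loose);  // At 9 _ the _ sign fell _
console.log(strict); // At 9 _ the aXm sign fell _
```

Note that `\b` only helps when a word starts and ends with a letter, digit or underscore, so words that begin or end with punctuation would need a different boundary check.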
Conclusion
Using String.prototype.replace() with regular expressions provides an efficient way to remove multiple words from a sentence. The key is using word boundaries and proper regex flags for accurate matching.
