NLP Part 4: Stop Words

Chitra's Playground
2 min read · Sep 6, 2024

Stop words are very common words, such as “the,” “and,” and “in,” that carry little meaning on their own. They are often removed from text before analysis so that the remaining terms are the more informative ones.
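To make the idea concrete before turning to a library, here is a minimal sketch in plain Python. The tiny `STOP_WORDS` set and the `remove_stop_words` helper are assumptions made up for illustration; real stop word lists are much longer.

```python
import re

# Assumption: a tiny hand-picked stop word set, for illustration only.
STOP_WORDS = {"the", "and", "in", "a", "is", "of"}


def remove_stop_words(text: str) -> list[str]:
    """Lowercase the text, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


filtered = remove_stop_words("The cat sat in the garden and slept")
print(filtered)  # ['cat', 'sat', 'garden', 'slept']
```

The words that survive the filter are the ones that say what the sentence is about.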

print(nlp.Defaults.stop_words)

This prints spaCy’s default set of stop words. Here, nlp is assumed to be a spaCy pipeline that has already been loaded, for example with spacy.load. You can easily add words to this set or remove them. Before adding “btw,” let’s check whether it is already a stop word.

nlp.vocab['btw'].is_stop

The check returns False, meaning “btw” is not currently a stop word, so we can go ahead and add it.

# Add a stop word
nlp.Defaults.stop_words.add('btw')
nlp.vocab['btw'].is_stop = True
nlp.vocab['btw'].is_stop

This code makes “btw” a stop word. The first statement adds “btw” to the default stop word set. The second sets the is_stop flag on the word’s vocabulary entry; the flag is stored on that entry, so setting it explicitly makes sure the change takes effect. The last statement reads the flag back, and it should now return True.
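To see what the new stop word changes in practice, the sketch below filters the tokens of a short sentence. It assumes spaCy is installed and uses `spacy.blank("en")`, a pipeline with only the English defaults, as a stand-in for whatever `nlp` object you have loaded; the sample sentence is made up for illustration.

```python
import spacy

# Assumption: a blank English pipeline is enough here, because stop words
# come from the language defaults rather than from a trained model.
nlp = spacy.blank("en")

# Same two steps as above: update the set, then set the flag.
nlp.Defaults.stop_words.add("btw")
nlp.vocab["btw"].is_stop = True

# Keep only the tokens that are not flagged as stop words.
doc = nlp("btw the model overfits")
kept = [token.text for token in doc if not token.is_stop]
print(kept)
```

With the flag set, “btw” is dropped along with the built-in stop words.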

Now, let’s try removing “ca” from the list.

# Remove a stop word
nlp.Defaults.stop_words.remove('ca')
nlp.vocab['ca'].is_stop = False
nlp.vocab['ca'].is_stop

The first statement removes “ca” from the default stop word set. The second clears the is_stop flag on its vocabulary entry. The third reads the flag back, and it should return False because “ca” is no longer treated as a stop word.
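One thing to watch for: Python’s `set.remove` raises a `KeyError` if the word is not in the set, which happens if you run the removal twice. The sketch below uses `set.discard` instead, which does nothing when the word is already gone. It assumes spaCy is installed and again uses `spacy.blank("en")` as a stand-in pipeline. The sample sentence relies on spaCy’s English tokenizer splitting “can't” into “ca” and “n't”.

```python
import spacy

# Assumption: a blank English pipeline, used as a stand-in for your own nlp object.
nlp = spacy.blank("en")

# discard() is safe to call repeatedly; remove() would raise KeyError
# the second time because "ca" is no longer in the set.
nlp.Defaults.stop_words.discard("ca")
nlp.Defaults.stop_words.discard("ca")
nlp.vocab["ca"].is_stop = False

# "can't" is tokenized as "ca" + "n't", so "ca" now survives the filter.
doc = nlp("I can't wait")
kept = [token.text for token in doc if not token.is_stop]
print(kept)
```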

As you can see, working with stop words in natural language processing is a straightforward process. By understanding how to add and remove stop words, you can customize the analysis of your text data to better suit your specific needs. This can be particularly useful when dealing with specialized domains or languages where certain words might be more or less relevant than in general-purpose text.
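For a specialized domain you will usually change several words at once. Here is one possible way to wrap the two steps in a small helper. The function name `customize_stop_words` and the clinical-notes word lists are hypothetical, chosen only for illustration, and the sketch assumes spaCy is installed.

```python
import spacy


def customize_stop_words(nlp, add=(), remove=()):
    """Hypothetical helper: apply a batch of stop word additions and removals."""
    for word in add:
        nlp.Defaults.stop_words.add(word)
        nlp.vocab[word].is_stop = True
    for word in remove:
        # discard() avoids a KeyError if the word was never a stop word.
        nlp.Defaults.stop_words.discard(word)
        nlp.vocab[word].is_stop = False


# Assumption: a blank English pipeline as a stand-in for your own nlp object.
nlp = spacy.blank("en")

# Assumption: example word lists for a corpus of clinical notes, where
# "patient" appears everywhere but "not" changes the meaning.
customize_stop_words(nlp, add=["patient", "pt"], remove=["not"])

doc = nlp("patient did not report pain")
kept = [token.text for token in doc if not token.is_stop]
print(kept)
```

Keeping the additions and removals in plain lists makes it easy to review them or reuse them across projects.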
