Have you ever been frustrated when a database search fails to find what you're looking for? One possible reason is that you are not aware of the “stop words” (also called “noise words”) in your particular database system.
Over the years, frustrated paralegals and attorneys have asked for my help with database searches. Most of the time, the issue is that they are trying to include “stop words” in their search query.
“Stop words” are common words that carry very little meaning on their own. Common examples include “of” and “the”.
For example, suppose you want to search for “President of the United States”. This query is problematic because it contains two stop words (of, the). Instead, I would search for (president w/3 “united states”). The w/3 operator is a common proximity search technique that finds the terms within three words of each other.
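To make the proximity idea concrete, here is a minimal Python sketch of what a w/3 style check might do. It is not how any particular review platform implements proximity. The function names (`tokenize`, `phrase_positions`, `within`) are hypothetical, and the sketch assumes that distance is counted as the difference in word positions, with every word (stop words included) counted. Real systems may count differently.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_positions(tokens, phrase):
    """Return (start, end) token positions for each occurrence of a phrase."""
    words = tokenize(phrase)
    n = len(words)
    return [
        (i, i + n - 1)
        for i in range(len(tokens) - n + 1)
        if tokens[i:i + n] == words
    ]


def within(text, left, right, distance):
    """True if `left` and `right` occur within `distance` words of each other.

    Assumption: order does not matter, and the gap is measured from the
    nearest edges of the two terms.
    """
    tokens = tokenize(text)
    for l_start, l_end in phrase_positions(tokens, left):
        for r_start, r_end in phrase_positions(tokens, right):
            if r_start > l_end:
                gap = r_start - l_end
            elif l_start > r_end:
                gap = l_start - r_end
            else:
                continue  # the two matches overlap; skip this pair
            if gap <= distance:
                return True
    return False


near = within("The President of the United States spoke.",
              "president", "united states", 3)
far = within("The president visited several European capitals "
             "before returning to the United States.",
             "president", "united states", 3)
```

In the first sentence the two terms are three words apart, so the check succeeds. In the second they are nine words apart, so it fails even though both terms appear in the document.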
You could also search for (president AND “united states”). The AND in this instance is a “boolean search operator”, not a search term. This query would find documents that contain both terms, even if they are not in close proximity to each other, which may not be what you want.
In a previous article titled An Index Disguised as a List, I mentioned a “database index”. When a database index is created, “stop words” are typically excluded from the index. Because those words are never added to the index, we cannot search for them.
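The effect of excluding stop words at indexing time can be illustrated with a small Python sketch. This is a simplified model, not the indexing code of any real database product. The stop word list, the sample documents, and the names `STOP_WORDS`, `tokenize`, and `build_index` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Illustrative stop word list only; real systems ship their own defaults.
STOP_WORDS = frozenset({"of", "the"})


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents, stop_words=STOP_WORDS):
    """Map each non-stop word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            if word not in stop_words:  # stop words never enter the index
                index[word].add(doc_id)
    return dict(index)


# Hypothetical sample documents.
documents = {
    1: "President of the United States",
    2: "The president signed the order",
}

index = build_index(documents)
```

After indexing, `index["president"]` points to both documents, but there is no entry at all for “of” or “the”, so a lookup for either word has nothing to find.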
There are many variables when it comes to “stop words”. The default list of “stop words” is not consistent across all database systems. In fact, some database systems do not use “stop words” by default. In addition, when a query contains “stop words”, some database systems will simply ignore them, while other database systems will yield zero hits.
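The two behaviors described above can be sketched in Python as follows. This is a toy model under stated assumptions: the index is hand-built, the stop word list is illustrative, and the policy names "ignore" and "strict" are hypothetical labels, not settings from any real product.

```python
# Illustrative stop word list only.
STOP_WORDS = frozenset({"of", "the"})

# A tiny hand-built index: word -> set of document ids.
# Stop words were never indexed, so they have no entries.
INDEX = {
    "president": {1, 2},
    "united": {1},
    "states": {1},
}


def search_all_terms(query, policy):
    """Return ids of documents that contain every word in the query.

    policy "ignore": drop stop words from the query before searching.
    policy "strict": look up every word, including stop words.
    """
    words = query.lower().split()
    if policy == "ignore":
        words = [w for w in words if w not in STOP_WORDS]
    elif policy != "strict":
        raise ValueError("policy must be 'ignore' or 'strict'")
    if not words:
        return set()
    hits = None
    for word in words:
        postings = INDEX.get(word, set())
        hits = set(postings) if hits is None else hits & postings
    return hits


ignored = search_all_terms("president of the united states", "ignore")
strict = search_all_terms("president of the united states", "strict")
```

With the "ignore" policy the query finds document 1. With the "strict" policy the same query returns nothing, because “of” and “the” are not in the index and every word is required to match.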
Most database search engines will include a default list of “stop words”. This default list can then be customized by the service provider, by the client, by the matter, or by personal preferences.
My advice is to find out what the “stop words” are before running any searches. If the “stop words” list contains words that you know are pertinent to the case, and that the legal team will expect to be able to search for, those words can be removed from the list.