SQL: Matching text across multiple fields

There are some very advanced methods of doing text searches in databases, but there are cases where you need to stick with using regular SQL to find text matches. However, there are challenges when doing more than simple searches. One case would be if you need to find rows where multiple strings exist at least once across multiple fields.

Let’s take this example, which is a table of books, containing a published date, the name of the author, and a snippet of text from the book. In this example, I’m using an SQLite database, but the core concept should apply to other databases, including MySQL.

CREATE TABLE books (
    bookid INT PRIMARY KEY,
    published TEXT,
    author TEXT,
    booktext TEXT
);
INSERT INTO books (bookid, published, author, booktext) VALUES 
(1, "Dec 1, 2003", "John Smith", "A comprehensive guide to Math"),
(2, "Jan 15, 2014", "Jane Doe", "Rabbits of North America"),
(3, "Mar 20, 1990", "George Ham", "A guide to cooking outdoors"),
(4, "Mar 25, 1991", "Stephanie Waters", "The Science of Math"),
(5, "Dec 1, 2000", "Jane Matheson", "Grade 12 Science");

What we want to solve here is a situation where than one phrase is provided, and we need to return rows where every phrase exists at least once in the row.

Take for example “Dec” and “Math”. Looking at the data, we can see that rows 1 and 5 both contain “Dec” and “Math”. We don’t want to return row 4 because it doesn’t include “Dec”.

Solving this with a series of published LIKE “%Math%” commands, grouped using ANDs and ORs is impractical, confusing, if it is even possible at all. Lucky for us, there is a better solution: A sub-query, using the IN command. Doing this allows us to run multiple queries (you will likely be dynamically generating your SQL based on the input) on the same table, and returning only those where the row matches in every case. Here’s the SQL:

SELECT *
   FROM books
  WHERE bookid IN (
     SELECT bookid 
       FROM books 
      WHERE published LIKE "%Dec%"
         OR author LIKE "%Dec%"
         OR booktext LIKE "%Dec%"
    )
    AND bookid IN (
     SELECT bookid 
       FROM books 
      WHERE published LIKE "%Math%"
         OR author LIKE "%Math%"
         OR booktext LIKE "%Math%"   
    );

Consider what each of the IN sub-queries returns. The first one, looking for “Dec”, returns 1 and 5. The second, looking for “Math”, returns 1, 4, and 5. The outside query is looking for cases where bookid is in both, and therefore only rows 1 and 5 are returned.

It should be noted that this method of searching isn’t very efficient, and should be limited to cases where the volume of text and number of records is small. For something more robust, you should use full text search functionality. SQLite has one built in: SQLite FTS. But for cases where it isn’t available, or you are dealing with text that doesn’t work well with full text search indexing, this method might help you solve your problem.