LIKE an apple-pie ORDER BY

Postgres derailed indexes. Part 2

AlekseyL
3 min readNov 24, 2017

How to miss a GIN index in your queries you can read in Yo-ho and the bottle of GIN. This one is example how to miss a b-tree index using varchar_pattern_ops/text_pattern_ops in non-C locale environment.

Autocomplete case

The best reallife case scenario would be an autocomplete implementation. Assume you need a fast autocompletion on some text column. User typing a text and an application delivers ten nearest words/phrases alphabetically sorted, like when you do googling.

Fulltext search problems

This can be done with full text search using tsvector and ts_queries, they can do prefix search. But FTS may be highly inefficient for short piece of string: starting with 1–2 symbols is practically impossible to deliver properly fast on large sets. You may try to solve this with trigrams but still you’ll got second problem for FTS: if you deliver sorted and limited results to user, Postgres will need to sort all record set for a given prefix-condition, since GIN search cannot deliver sorted sets. So even if you use prefix with 3 and more symbols you still need to sort results alphabetically, which may take an unacceptable amount of time.

LIKE search%

LIKE prefix% query to the rescue? Yes, but here goes some unexpected problems. Lets start with examples.

CREATE TABLE words (
id serial,
word varchar
);

Lets create index for it.

CREATE INDEX index_words_word ON words(word);

Example dataset size: 420K mix of russian and english unique words.

Lets try to do some search:

Bad news: index not used.

It’s not like it’s completly not used, it’s not LIKE used:

Finding out what’s wrong is simple, you can do some googling and find the solution:

  1. Text search strategies in postgresql.
  2. Posgresql LIKE-query perfomance variations
  3. PgSQL Indexes and “LIKE”
  4. http://www.efrag.gr/2011/07/postgresql-indexes/

and e.t.c. We missed varchar_pattern_ops/text_pattern_ops when we create our index.

Right index will look like this:

CREATE INDEX index_words_word ON words(word text_pattern_ops);

Rerun query:

Yeah! It work, lets order by word and we are done!

This is one tiny thing all this tutors and solutions always forgot to mention.

ORDER BY … USING

Actually index is correct and works fine, the only thing is broken here is an ORDER BY clause.

If you try to find standalone description for a PostgreSQLs ORDER BY clause you’ll find yourself in quite of a situation: every 2 from a given 3 descriptions of ORDER BY are incomplete, even original Postgres documentation about row ordering doesn’t contain desired info. I’m talking about USING part of ORDER BY clause:

ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...]

Here are tutors/docs that missing USING:

  1. https://www.postgresql.org/docs/10/static/queries-order.html
  2. https://www.tutorialspoint.com/postgresql/postgresql_order_by.htm
  3. http://www.postgresqltutorial.com/postgresql-order-by/
  4. https://www.postgresql.org/docs/10/static/queries-order.html

Here are the solid ones with USING:

  1. https://www.postgresql.org/docs/10/static/sql-select.html#SQL-ORDERBY
  2. https://w3resource.com/PostgreSQL/postgresql-order-by.php

So what is all the fuss about, how USING solve the problem? Lets start with rewriting original query with USING

Yeah, it looks like the one we already had previously. So now if you observant enough you may notice that index conditions use special set of operators when comparing values! Index uses ~<~, and the ORDER BY clause use < !

Lets try to apply this insight:

That’s what are we looking for! As you already may understand index is created based on one operators class (text_pattern_ops) and we were trying to order on default one (text_ops). More about operators classes in Postgres could be found here.

Rails perspective

To create index with text_patter_ops in rails you can use raw SQL inside execute block in migration:

reversible do |dir|
dir.up do
execute <<-INDEX
CREATE INDEX index_words_word ON words(word varchar_pattern_ops);
INDEX
end
dir.down do
execute 'DROP INDEX index_words_word'
end
end

in rails 5 you have more options:

create_table :words do |t|
t.string :word
t.index 'lower(word) varchar_pattern_ops'
end
# OR in changedef change
add_index :words, 'lower(word) varchar_pattern_ops',
name: "index_words_word"
end

Search query with order:

Word.where('word LIKE ?', "#{prefix}%")
.order('ORDER BY word USING ~<~')
.limit( 10 )

Resume

For non-C locales using b-tree index on text/varchar may be little confusing, but for now you are completly aware of possible pitfalls.

Additional links

  1. ORDER BY … USING - nice stackoverflow description
  2. varchar_pattern_ops VS text_pattern_ops ( TL;DR they are the same)
  3. Proper prefix search with indexes
  4. Operator Classes and Operator Families
  5. Erwin Brandstetter’s answer on some ORDER BY and text_pattern_ops problems with cool links as usual
  6. Pattern matching in Postgres also by Erwin

--

--

AlekseyL

Chief Software Architect / CTO, Ruby and PostgresSQL fan.