A guide to PostgreSQL Indexing with SQLAlchemy

Opcito Technologies
2 min readJul 16, 2024

--

Effective database indexing is paramount for ensuring efficient data retrieval and minimizing query execution times. PostgreSQL, a leading relational database management system, provides a comprehensive set of indexing functionalities. This guide explores leveraging SQLAlchemy, a robust Python library for SQL interaction and Object-Relational Mapping (ORM), to create and manage these indexes within your PostgreSQL environment.

sqlalchemy-postgresql-specific-index-options

Basic Indexing

The index=True option allows you to build an index on a specific column. It creates a B-tree index by default on the specified column. B-tree indexes are the default indexing type in PostgreSQL and are suitable for most types of queries, particularly those involving equality comparisons, range queries, and sorting.

from sqlalchemy import JSON, TIMESTAMP, Column, Index, Integer, String,
from data.modelbase import ModelBase
class Publisher(ModelBase):
__tablename__ = 'Publisher'

id = Column(
Integer,
primary_key=True,
autoincrement=True,
)
name = Column(String, nullable=False, index=True)
from sqlalchemy import Index
# Define your table model
class MyModel(Base):
__tablename__ = 'my_table'
id = Column(Integer, primary_key=True)
value = Column(Integer)
# Create a partial index on the 'value' column where 'value' is greater than 100
partial_index = Index('idx_value_gt_100', MyModel.value, postgresql_where=(MyModel.value > 100))
partial_index.create(bind=engine)

Composite Index

In PostgreSQL, you can create indexes on multiple columns within a table. These are called multicolumn indexes. They are also sometimes referred to as composite, combined, or concatenated indexes. A multicolumn index can include up to 32 columns. This limit can be increased if you modify the pg_config_manual.h while you rebuild PostgreSQL from source code.

However, not all types of indexes support multiple columns. Only B-tree, GiST, GIN, and BRIN indexes can be used with multiple columns. To create a multicolumn index, you can specify the columns during table creation. This is typically done by adding an index definition to the __table_args__ property and listing the desired columns within it.

from sqlalchemy import JSON, TIMESTAMP, Column, Index, Integer, String,
from data.modelbase import ModelBase
import datetime
class Employee(ModelBase):
__tablename__ = 'Employee'

id = Column(Integer, primary_key=True, autoincrement=True)
last_name = Column(String, nullable=False)
first_name = Column(String, nullable=False)
birth_date = Column(String)
created_date = Column(TIMESTAMP)

__table_args__ = (
Index('my_index', "last_name", "first_name", postgresql_using="btree"),
)

Let’s look at the syntax for creating a multicolumn index:

CREATE INDEX [IF NOT EXISTS] index_name
ON table_name(column1, column2, ...)

In this syntax:

  • Choose an index name: Use the CREATE INDEX clause followed by a descriptive name for your index. You can also use the IF NOT EXISTS option to avoid errors if the index already exists.
  • Specify the table and columns: Within parentheses after the table name, list the columns you want to inclued in the index.

Order the columns by their usage in WHERE (read more..)

--

--

Opcito Technologies

Product engineering experts specializing in DevOps, Containers, Cloud, Automation, Blockchain, Test Engineering, & Open Source Tech