SQL is great! It can handle complex queries involving multiple tables, aggregations, subqueries, and joins. It can perform CRUD (Create, Read, Update, Delete) operations. It enforces data integrity through constraints like primary and foreign keys. It supports transactions, allowing multiple operations to be executed as a single unit, which can be rolled back if one of the operations fails.

Despite all its strengths, which are too many to list here, SQL can be awkward to integrate with host languages such as JavaScript and Python. There’s often an impedance mismatch between SQL’s declarative nature and the host language’s object-oriented or functional paradigms since SQL queries are typically written as strings within the host language, making it difficult to apply composable programming techniques.

Several solutions have been developed over the years to address the impedance mismatch, with ORM (Object-Relational Mapping) libraries being one of them. ORMs such as Java’s Hibernate are widely used to model a relational database using objects, but there are various difficulties in making them work well.

Fortunately, there exists a category of libraries called query builders like Knex for Node.js. Query builders allow the construction of queries programmatically without resorting to string concatenation and, unlike ORMs, they don’t suffer from the object-relational impedance mismatch.

Knex

To demonstrate how a query builder works, let’s model a simple discussion forum. To keep things simple, we will use just three tables: post, topic and user.


The database tables for a simple discussion forum. The database tables for a simple discussion forum.

To retrieve all the posts in a topic along with information about the posters, we can use the following piece of code:

const query = require('knex')({
    client: 'pg',
    connection: 'postgresql://localhost/my_forum',
});

const getPostsInTopic = (topicId) =>
    query
        .from('post')
        .select('post.*', 'user.username', 'user.country')
        .join('user', 'user.id', 'post.user_id')
        .where('post.topic_id', topicId)
        .orderBy('post.id');

And the result should look something like this:

[
  {
    id: 1210,
    user_id: 327,
    topic_id: 42,
    content: 'Hi there!',
    created_at: '2024-11-22T14:34:26.454Z',
    updated_at: '2024-11-22T14:34:26.454Z',
    username: 'Asuka',
    country: 'JP'
  },
  // ...
]

Knex uses a fluent interface to build queries, making it possible to modify them by simply chaining additional methods. In the example below, we’ve made retrieving details about posters optional:

const getPostsInTopic = (topicId, withPosterDetails = false) => {
    let myQuery = query
    	.from('post')
    	.select('post.*')
    	.where('post.topic_id', topicId)
    	.orderBy('post.id');

    if (withPosterDetails) {
        myQuery = myQuery
        	.select('user.username', 'user.country')
        	.join('user', 'user.id', 'post.user_id');
    }

    return myQuery;
};

Pipelines

Knex’s built-in object-oriented fluent interface works well, but for those who prefer a functional-style approach and the advantages that come with it, it’s possible to use an alternative interface based on pipelines.

“It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” – Alan Perlis

It’s surprisingly simple to implement a generic pipeline function that accepts multiple functions as input, passes an initial value to the first function, and executes the functions sequentially, passing the output of each function to the next:

const pipe = (...functions) => (initialValue) =>
	functions.reduce((accumulator, fn) => fn(accumulator), initialValue);

We will also need an identity function, which simply returns its input, acting similarly to a no-operation (no-op) function.

const identity = (x) => x;

We can now utilize the pipe function to generate our SQL query using only function composition. The query modifier functions below return functions that accept the query object as their input:

const postTable = query.from('post');

const postsInTopic = (topicId) => (query) => query
	.select('post.*')
	.where('post.topic_id', topicId);

const includePosterDetails = () => (query) => query
	.select('user.username', 'user.country')
	.join('user', 'user.id', 'post.user_id');

const orderBy = (...fields) => (query) => query.orderBy(...fields);

const getPostsInTopic = (topicId, withPosterDetails = false) =>
    pipe(
        postsInTopic(topicId),
        withPosterDetails ? includePosterDetails() : identity,
        orderBy('post.id')
    )(postTable);

Summary

We have used a query builder to programmatically build SQL queries, and by using higher-order functions (functions that take other functions as arguments) and pipelines, we have:

  • Eliminated the need to pass the query object explicitly.
  • Made it easier to add, remove, or rearrange query modifiers without disrupting the overall structure.
  • Improved maintainability and simplified testing since each query modifier function is a self-contained unit that can be reused across various queries and scenarios.

Related: