Good Practices For Testing Database Interactions

Here is what I learnt while testing backend calls to the database on a project using node.js with Loopback and a PostgreSQL database though this article would apply to any technology.

Basically, it all goes back to saying that each test should have full power over its data. Each test must ensure its independence by cleaning the database first and then defining its own dataset according to its needs.

All tests should start by cleaning the database

When testing a function that reads or writes in the database, it is crucial that you know exactly what data it contains. This means that you should setup the dataset yourself and not rely on any previous data it may contain. This also means you need to clean the database before putting any test data in the database.

In our project, we used to clean the data at the end of each test thinking this was a good way to make our tests independent. However, making the assumption of a clean database is nothing else but making all tests dependent. This is actually a bad practice because a test could break because another test changed.

As a matter of fact, we forgot to clean the database after a few tests. As a consequence, other tests that were executed afterwards were failing randomly depending on the order of execution. We would relaunch all tests until they all passed… Then we moved the cleaning of the database at the beginning of the tests so that each test was responsible for its independence. The result of each test became consistent as they would pass or fail always in the same way across executions.

Yes some tests did fail after that. This highlighted the fact that they were poorly written, which leads me to my second point.

All tests should define their own test data

At the risk of repeating myself, it should be clear what data you have in your database at the beginning of a test. Otherwise, your tests will not be easily maintainable.

In our project, we used to update a common set of data and load it for each test.

We soon faced two problems:

By adding, removing, or updating data, we would break other tests or worse make them useless without breaking. For instance, take a function that filters an array of objects depending on some condition. Your test array has two entries: one that will be kept, one that will be removed. If you remove the entry that should have been removed, the test still passes, but becomes useless.
When updating a function, we had to retro-engineer the dataset to find the new result of the test. Indeed, the common dataset was not made to give us useful data. It contained more useless data than useful data.

When it became impossible to update this common dataset, we decided to define an entire new set of data for each new test. This takes time and requires more lines of code, but eventually made us more efficient. We were able to write more tests in the same amount of time and thus write tests for more cases.

Ids of test data should be hard-coded

You want to make your data as easy to use as possible. Fetching data by id will simplify your tests and make them more readable.

We were not doing this on our project because alls ids were auto-incremented by the database. Consider for instance two persons whose names are ‘Person A’ and ‘Person B’. We want to check that Person A gave 100€ to Person B. If we don’t know the ids for personA, personB, bankAccountA and bankAccountB, here is what the test could look like using Loopback.

// In transfer.test.js
it('should transfer 100 euros', function(done) {
  var accountAId, accountBId;
  cleanDatabase()
  .then(function() {
    return Promise.all([
      Person.create({name: 'Person A'}),
      Person.create({name: 'Person B'})
    ]);
  })
  .then(function(createdPersons) {
    var personAId = createdPersons[0].id;
    var personBId = createdPersons[1].id;
    return Promise.all([
      BankAccount.create({personId: personAId, amount: 100}),
      BankAccount.create({personId: personBId, amount: 0})
    ]);
  })
  .then(function(createdAccounts) {
    accountAId = createdAccounts[0].id;
    accountBId = createdAccounts[1].id;
    return transfer('Person A', 'Person B', 100);
  })
  .then(function() {
    return Promise.all([
      BankAccount.findById(accountAId),
      BankAccount.findById(accountBId)
    ]);
  })
  .then(function(fetchedAccounts) {
    expect(fetchedAccounts[0].amount).toEqual(0);
    expect(fetchedAccounts[1].amount).toEqual(100);
    return done();
  })
  .catch(done);
});

Now, if you hard-code ids, here is what this test might look like:

// In transfer.test.js
it('should transfer 100 euros', function(done) {
  cleanDatabase()
  .then(function() {
    return Promise.all([
      Person.create({id: 1, name: 'Person A'}),
      Person.create({id: 2, name: 'Person B'})
    ]);
  })
  .then(function() {
    return Promise.all([
      BankAccount.create({id: 1, personId: 1, amount: 100}),
      BankAccount.create({id: 2, personId: 2, amount: 0})
    ]);
  })
  .then(function() {
    return transfer('Person A', 'Person B', 100);
  })
  .then(function() {
    return Promise.all([
      BankAccount.findById(1),
      BankAccount.findById(2)
    ]);
  })
  .then(function(fetchedAccounts) {
    expect(fetchedAccounts[0].amount).toEqual(0);
    expect(fetchedAccounts[1].amount).toEqual(100);
    return done();
  })
  .catch(done);
});

What we would love to write is actually this:

// In transfer.test.js
it('should transfer 100 euros', function(done) {
  // Setup data
  var data = {
    Person: [
      {id: 1, name: 'Person A'},
      {id: 2, name: 'Person B'}
    ],
    BankAccount: [
      {id: 1, personId: 1, amount: 100},
      {id: 2, personId: 2, amount: 0}
    ]
  };
  cleanDatabase()
  .then(function() {
    feedDatabase(data);
  })
  .then(function() {
    return transfer('Person A', 'Person B', 100);
  })
  .then(function() {
    return Promise.all([
      BankAccount.findById(1),
      BankAccount.findById(2)
    ]);
  })
  .then(function(fetchedAccounts) {
    expect(fetchedAccounts[0].amount).toEqual(0);
    expect(fetchedAccounts[1].amount).toEqual(100);
    return done();
  })
  .catch(done);
});

The feedDatabase function needs to fill table Person before table BankAccount in order to avoid a foreign key constraint violation error on table BankAccount for constraint personId. We write this feedDatabase function in a module that will be common to all tests.

// In common/test_setup.js
const FEED_ORDER = [
  ['Person'],
  ['BankAccount']
];

var feedDatabaseInOrder = function(index, app, data) {
  var line, modelInsertions, modelName;
  if (index === FEED_ORDER.length) {
    return;
  }
  modelInsertions = [];
  for (modelName of FEED_ORDER[index]) {
    if (data.hasOwnProperty(modelName)) {
      for (line of data[modelName]) {
        modelInsertions.push(app.models[modelName].create(line));
      }
    }
  }
  Promise.all(modelInsertions).then(function() {
    return feedDatabaseInOrder(index + 1, app, data);
  });
};

var feedDatabase = function(app, data) {
  return feedDatabaseInOrder(0, app, data);
};

Note that feedDatabase needs to be given app which is the application instance.

Improvement 1: using a data initializer

Let’s improve our example above. Each bank account needs to belong to a bank.

To satisfy this constraint, our data in the test needs to look like:

var data = {
  Bank: [
    {id: 1, name: 'Bank 1', country: 'France', authorizationNumber: '123'}
  ],
  Person: [
    {id: 1, name: 'Person A'},
    {id: 2, name: 'Person B'}
  ],
  BankAccount: [
    {id: 1, personId: 1, bankId: 1, amount: 100},
    {id: 1, personId: 2, bankId: 1, amount: 0}
  ]
};

We also need to add the Bank model to FEED_ORDER in the common module:

// In common/test.setup.js
const FEED_ORDER = [
  ['Person', 'Bank'],
  ['BankAccount']
];

However, the bank to which the bank accounts belong has no impact on our transfer function. We would like to keep in the test only what is meaningful.

In another common file, let’s define a data initializer. It should contain no business data but a default value for each model with id 1 by convention. The initializer is not meant to be used as test data. Its aim is only to help satisfy foreign key constraints.

// In common/data_initializer.js

const DATA_INITIALIZER = {
  Bank = [
    {id: 1, name: 'Bank 1', country: 'France', authorizationNumber: '123'}
  ],
  BankAccount: [
    {id: 1, personId: 1, bankId: 1, amount: 1}
  ],
  Person: [
    {id: 1, name: 'Person A'}
  ]
};

var getDefaultData = function() {
  // This clones DATA_INITIALIZER so that it cannot be altered
  return JSON.parse(JSON.stringify(DATA_INITIALIZER));
};

In our test, we can now write:

var data = getDefaultData()
data.BankAccount = [
  {id: 1, personId: 1, bankId: 1, amount: 100},
  {id: 1, personId: 2, bankId: 1, amount: 0}
];
data.Person = [
  {id: 1, name: 'Person A'},
  {id: 2, name: 'Person B'}
];

It’s important that you override BankAccount and Person to be able to see all useful data within the test itself and not be dependent on default values such as the default bank account amount.

Improvement 2: resetting id sequences

When hard-coding ids as we did, any further attempt to insert a new entry in the database without hard-coding its id will fail. Indeed, while we inserted data with hard-coded ids, the id sequences were never updated. The database will automatically try to insert the new entry with id 1, which is already used.

The best way to deal with this problem is to reset the id sequence manually. The most transparent is probably to restart the id sequence at the max id of all inserted rows.

// In common/test_setup.js

var updateIdSequences = function(app) {
  var datasourceConnector, table, tables, updates;
  datasourceConnector = app.models[FEED_ORDER[0][0]].dataSource.connector;
  updates = [];

  for (tables of FEED_ORDER) {
    for (table of tables) {

      var tableName = table.toLowerCase();
      var sequence = tableName + '_id_seq';

      updates.push(new Promise(function(resolve, reject) {
        var findMaxIdQuery = 'SELECT MAX(id) FROM ' + table + ';';

        return datasourceConnector.query(findMaxIdQuery, [], getCallback(sequence, dataSourceConnector, reject, resolve));
      }));
    }
  }
  return Promise.all(updates);
};

function getCallback = (sequence, dataSourceConnector, reject, resolve) {
  return function (err1, res) {
    if (err1) { return reject(err1); }

    if (res[0].max != null) {
      var updateIdSequenceQuery = 'ALTER SEQUENCE ' + sequence + ' RESTART WITH ' + (res[0].max + 1) + ';';

      return datasourceConnector.query(updateIdSequenceQuery, [], function(err2) {
        if (err2) { return reject(err2); }
        resolve();
      });
    } else {
      resolve();
    }
  };
}

var feedDatabase = function(app, data) {
  return feedDatabaseInOrder(0, app, data)
  .then(function() {
    // ------------> update all id sequences
    updateIdSequences(app);
  });
};

Improvement 3: debugging the feedDatabase function

When using the feedDatabase function, I once had trouble understanding a bug in one of my tests. I had forgotten to add the new model I had just created to the FEED_ORDER constant. I made a small change to feedDatabase in order to count the number of inserted models. The function now returns an explicit error when one model of data has not been used.

// In common/test_setup.js
var feedDatabaseInOrder = function(index, app, data, countInsertedModels) {
  var line, modelInsertions, modelName;
  if (index === FEED_ORDER.length) {
    return;
  }
  modelInsertions = [];
  for (modelName of FEED_ORDER[index]) {
    if (data.hasOwnProperty(modelName)) {

      // ------------> increment model count
      countInsertedModels++;

      for (line of data[modelName]) {
        modelInsertions.push(app.models[modelName].create(line));
      }
    }
  }
  Promise.all(modelInsertions).then(function() {
    return feedDatabaseInOrder(index + 1, app, data, countInsertedModels);
  });
};

var feedDatabase = function(app, data) {
  return feedDatabaseInOrder(0, app, data, 0)
  .then(function(countInsertedModels) {
    // ------------> throw error if counts don't match
    if(countInsertedModels != Object.keys(data).length) {
      throw new Error('Some model forgotten in FEED_ORDER in common/test_setup.js');
    }
  })
  .then(function() {
    updateIdSequences(app);
  });
};

What we get in the end

common/test_setup.js

const FEED_ORDER = [
  ['Person', 'Bank'],
  ['BankAccount']
];

var feedDatabaseInOrder = function(index, app, data, countInsertedModels) {
  var line, modelInsertions, modelName;
  if (index === FEED_ORDER.length) {
    return;
  }
  modelInsertions = [];
  for (modelName of FEED_ORDER[index]) {
    if (data.hasOwnProperty(modelName)) {
      countInsertedModels++;

      for (line of data[modelName]) {
        modelInsertions.push(app.models[modelName].create(line));
      }
    }
  }
  Promise.all(modelInsertions).then(function() {
    return feedDatabaseInOrder(index + 1, app, data, countInsertedModels);
  });
};

var updateIdSequences = function(app) {
  var datasourceConnector, table, tables, updates;
  datasourceConnector = app.models[FEED_ORDER[0][0]].dataSource.connector;
  updates = [];

  for (tables of FEED_ORDER) {
    for (table of tables) {

      var tableName = table.toLowerCase();
      var sequence = tableName + '_id_seq';

      updates.push(new Promise(function(resolve, reject) {
        var findMaxIdQuery = 'SELECT MAX(id) FROM ' + table + ';';

        return datasourceConnector.query(findMaxIdQuery, [], getCallback(sequence, dataSourceConnector, reject, resolve));
      }));
    }
  }
  return Promise.all(updates);
};

function getCallback = (sequence, dataSourceConnector, reject, resolve) {
  return function (err1, res) {
    if (err1) { return reject(err1); }

    if (res[0].max != null) {
      var updateIdSequenceQuery = 'ALTER SEQUENCE ' + sequence + ' RESTART WITH ' + (res[0].max + 1) + ';';

      return datasourceConnector.query(updateIdSequenceQuery, [], function(err2) {
        if (err2) { return reject(err2); }
        resolve();
      });
    } else {
      resolve();
    }
  };
}

var feedDatabase = function(app, data) {
  return feedDatabaseInOrder(0, app, data, 0)
  .then(function(countInsertedModels) {
    if(countInsertedModels != Object.keys(data).length) {
      throw new Error('Some model forgotten in FEED_ORDER in common/test_setup.js');
    }
  })
  .then(function() {
    updateIdSequences(app);
  });
};

common/data_initializer.js

const DATA_INITIALIZER = {
  Bank = [
    {id: 1, name: 'Bank 1', country: 'France', authorizationNumber: '123'}
  ],
  BankAccount: [
    {id: 1, personId: 1, bankId: 1, amount: 1}
  ],
  Person: [
    {id: 1, name: 'Person A'}
  ]
};

var getDefaultData = function() {
  // This clones DATA_INITIALIZER so that it cannot be altered
  return JSON.parse(JSON.stringify(DATA_INITIALIZER));
};

transfer.test.js

it('should transfer 100 euros', function(done) {
  // Setup data
  var data = getDefaultData()
  data.BankAccount = [
    {id: 1, personId: 1, bankId: 1, amount: 100},
    {id: 1, personId: 2, bankId: 1, amount: 0}
  ];
  data.Person = [
    {id: 1, name: 'Person A'},
    {id: 2, name: 'Person B'}
  ];

  cleanDatabase()
  .then(function() {
    feedDatabase(app, data);
  })
  .then(function() {
    return transfer('Person A', 'Person B', 100);
  })
  .then(function() {
    return Promise.all([
      BankAccount.findById(1),
      BankAccount.findById(2)
    ]);
  })
  .then(function(fetchedAccounts) {
    expect(fetchedAccounts[0].amount).toEqual(0);
    expect(fetchedAccounts[1].amount).toEqual(100);
    return done();
  })
  .catch(done);
});

Good Practices For Testing Database Interactions

All tests should start by cleaning the database

All tests should define their own test data

Ids of test data should be hard-coded

Improvement 1: using a data initializer

Improvement 2: resetting id sequences

Improvement 3: debugging the feedDatabase function

What we get in the end

Gireg de Kerdanet

Web Developer at Theodo

Theodo France

Liked this article?