Chapter 1. Introduction: What is DBBalancer?

DBBalancer is a middleware daemon that sits between database clients (programs written in C, C++, Tcl, Java with JDBC, Perl with DBI, and so on) and a database server. Currently the only supported server is PostgreSQL, but the architecture is open to supporting more servers in the future.

To a client connecting to DBBalancer, the system offers a

Features

The main feature is a database connection pool, but DBBalancer can also act as a load balancer and as a database replicator:

A database connection pool

Like every database connection pool, DBBalancer trades memory for speed: it opens all the needed connections at startup and so saves connection setup time on each client request. The performance benefit of this arrangement is larger when the backend database takes a relatively long time to initialize a connection, as PostgreSQL or Oracle do. The best case is when the actual processing time (statement execution and data retrieval) is much smaller than the connection setup time. The worst case is when clients keep connections open for a long time; in that case using this pool (or any other) could even cost performance.
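The trade-off can be illustrated with a minimal Python sketch. It is not DBBalancer's implementation: the names ConnectionPool and FakeConnection are hypothetical, and FakeConnection only stands in for a real backend connection so that the example runs without a database.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Toy pool: opens every connection up front, then lends them out."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            # Pay the connection setup cost once, at startup.
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is lent out (the "worst case").
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the connection back for the next client.
            self._idle.put(conn)


class FakeConnection:
    """Hypothetical stand-in for a real backend connection."""

    opened = 0  # counts how many connections were ever set up

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, statement):
        return f"ok: {statement}"


pool = ConnectionPool(FakeConnection, size=3)
results = []
for i in range(10):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))
```

Ten requests are served here with only three connection setups. If clients held their connections for a long time, later callers would simply wait in connection(), which is the worst case described above.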

A load balancer

Scalability and fault tolerance have always been two requirements for enterprise systems. The arrival of the web made many other systems available to an ever-growing number of users. Many of these systems are not run by big enterprises, so their operators prefer free and open source software to expensive commercial products. The problem is the lack of truly scalable and safe hardware and software that meets those requirements.

The solution most sites implement is to cluster several machines. DBBalancer can spread the database load across several backends, which addresses the scalability and fault tolerance problem. But solving one problem creates another: the databases must be kept in sync if clients are to access any of them randomly and transparently. This is where replication solutions come in.
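Spreading load across backends while tolerating a failed one can be sketched as follows. This is an illustration only: the text does not say which selection policy DBBalancer uses, so round-robin is an assumption here, and the Balancer class and the backend addresses are hypothetical.

```python
import itertools


class Balancer:
    """Round-robin selection over backends, skipping any marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._down = set()
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, backend):
        self._down.add(backend)

    def mark_up(self, backend):
        self._down.discard(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            backend = next(self._cycle)
            if backend not in self._down:
                return backend
        raise RuntimeError("no backend available")


# Hypothetical backend addresses.
balancer = Balancer(["db1:5432", "db2:5432", "db3:5432"])
first_round = [balancer.pick() for _ in range(3)]

# Fault tolerance: a failed backend is skipped, the others keep serving.
balancer.mark_down("db2:5432")
after_failure = [balancer.pick() for _ in range(4)]
```

The sketch only answers the question of which backend serves the next client. It does nothing to keep the backends' data identical, which is the synchronization problem raised above.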

One possibility is to use the built-in replication support that some databases, like Postgres, offer now or will offer in the future. But Postgres replication, for example, is quite new and limited, so we may try the next feature: the DBBalancer database replicator.

A database replicator

Here we follow a simple method: send a copy of the client input to every backend, then consolidate the output of all the backends into a single reply that is sent back to the one client. If the data received from the backends differs, a sync error is returned to the client. This method should work in most cases, with the drawback that writing connections must use a different port.
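The method can be sketched in a few lines of Python. This is a simplified model, not DBBalancer's code: backends are plain callables, and the names replicate, SyncError and make_backend are hypothetical. The real daemon works on the database wire protocol rather than on strings.

```python
class SyncError(Exception):
    """Raised when the backends return different replies."""


def replicate(statement, backends):
    """Send the statement to every backend and return the single agreed reply."""
    if not backends:
        raise ValueError("no backends configured")
    # Every backend receives a copy of the client input.
    replies = [backend(statement) for backend in backends]
    first = replies[0]
    # Consolidate: all replies must match before one is sent to the client.
    if any(reply != first for reply in replies[1:]):
        raise SyncError(f"backends disagree: {replies!r}")
    return first


def make_backend(store):
    """Hypothetical backend: stores 'key=value' and reports its row count."""
    def run(statement):
        key, _, value = statement.partition("=")
        store[key] = value
        return f"stored {len(store)} rows"
    return run


# Two backends that start out identical stay in sync.
a, b = {}, {}
reply = replicate("x=1", [make_backend(a), make_backend(b)])

# A backend that has drifted produces a different reply, hence a sync error.
c, d = {}, {"stale": "1"}
try:
    replicate("x=1", [make_backend(c), make_backend(d)])
    sync_error = None
except SyncError as exc:
    sync_error = exc
```

Note that by the time the mismatch is detected, every backend has already executed the statement. The sketch reports the divergence but does not repair it.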

Note that I don't consider this replication solution a definitive one. Good, efficient replication belongs in the database backend. But as there is no working, stable and free solution of that kind at the time of writing, I offer this as a temporary workaround. That is another reason for implementing a "two daemon" solution instead of a more integrated one.

All of these functions can be combined and used simultaneously, depending only on how the configuration flags and the configuration file are set.

Even better, unlike some other solutions, all this can be achieved without changing the original client code, as DBBalancer accepts the usual (for now, only Postgres) clients: libpq, JDBC, libpgtcl, DBI, etc. Changes are only needed to use balancing and replication simultaneously, since that requires clients to use different ports to get "load-balancing" or "replication" connections.
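The client-side change for combined use might look like the sketch below. Everything in it is an assumption made for illustration: the port numbers, the port_for helper and the keyword list are hypothetical, and classifying statements by their first keyword is a simplification that a real client would need to refine.

```python
BALANCER_PORT = 5433    # hypothetical port for "load-balancing" connections
REPLICATOR_PORT = 5434  # hypothetical port for "replication" (writing) connections

# Statements assumed to modify data and therefore to need replication.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER"}


def port_for(statement):
    """Pick the daemon port a client should connect to for this statement."""
    words = statement.split()
    keyword = words[0].upper() if words else ""
    return REPLICATOR_PORT if keyword in WRITE_KEYWORDS else BALANCER_PORT


read_port = port_for("select * from accounts")
write_port = port_for("INSERT INTO accounts VALUES (1)")
```

A client that only reads, or that uses just one of the two features, needs no such logic and keeps its existing connection code.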