What Developers Need to Know about SQL – Set Based Programming

Here is one stand alone tweet from What Developers Need to Know About SQL Server, which is rather information dense and worth expanding on:

Stuart Miller compares the mental shift required for someone who is used to OOP do develop in SQL to the mental shift those used to imperative programming (c, COBOL, Basic and more) need to go through in order to truly understand and use OOP correctly.

This different programming paradigm is set based programming, taking its name from set theory, a mathematical framework from which the relational model of databases was derived by Edgar F. Codd, which in turn gave relational databases their name.

One of the major differences between set based programming and object oriented programming has to do with how they deal with groups of related items. Where in object oriented programming, one would iterate over a collection of objects, dealing with each item at a time, set based programming deals with the group as a whole – there are no iteration constructs. That’s right – there are no loops in set based programming.

As result, the way to think about and apply operators is in groups of related items (sets, or relations, in Codds’ relational model), not individual items. Operators allow filtering, aggregating, projection, joining and more – always resulting in groups of related items (where a group can be empty or consist of a single item).

Relational theory is the basis from which relational databases have been built and the relational engines that power them are optimized for this kind of thinking – set based thinking. Most relational database do offer looping constructs – but these are most often than not best avoided as they go against the grain and can be the cause of performance issues. This is not because relational databases are slow – it is because using loops instead of operating on sets goes against how they were built and what they are optimized to do.

2 Responses

  1. saj14saj April 11, 2014 / 17:49

    I am not sure the assertion that one should change to set based programming makes universal sense. Sometimes it does, sometimes it doesn’t.

    It absolutely makes sense in combination with functional programming as one might do in F#, and in my cases in C# things are more easily expressed with projection or filtering and so on.

    Of course it makes sense in the layers that actually communicate with the backing database, if that database is relational.

    On the other hand, client side operations may or may not need to be set based.

    Many of the LINQ operators implement these set based capabilities. But they are simply masking (and delaying, due to lazy evaluation, which can be a nightmare for debugging) the iteration operations that are part of the implementation.

    Often times, code is much more clear and straight forward (or easier to debug) done as several steps, with explicit iteration.

    I would suggest using set based operations when they are more clear and better document intention than the alternatives–but that is the standard I use for every programming construct, prior to the (rarely needed) requirement to optimize. Debugging and maintenance time over the life of the project are the scarcest, most expensive programmer resources and the ones that should be optimized.

  2. Oded Coster April 12, 2014 / 09:03

    You seem to have missed that this post is about SQL and relational databases. In this context, set based thinking very much “makes universal sense”.

Leave a Reply

Your email address will not be published. Required fields are marked *