Main Page Content
An Introduction To Generating Sequences With Sql
Writing code to create a sequence is one of the first things taught in programming school.
Remember this code from your early Basic programming bootcamp?
For i = 1 to 10
Print i
Next i
While loops (no pun intended) seem to be quite good at creating sequences, we will soon see that they are actually quite inelegant and awkward.
Structured Query Language, a language oriented towards set-based operations, is ideally suited to build sequences quickly and efficently.
Are sequences really useful in SQL? Yes they are! Here are some areas where I have found sequences to be indispensable:
- to build calendar-type applications and leave-day/business-day calculations (tables with ranges of dates)
- sales fact tables (tables with ranges of numbers or dates)
- creating volume data for volume testing of applications (tables with ranges of numbers, dates or alpha-numeric values)
Enough, show me some SQL
For all the examples in this article I have stuck to common SQL syntax. In a couple of instances where I have used database specific syntax it should not be too hard to convert the syntax to any other database. I tested all the syntax on a SQL Server 2000 installation, but at the end of the article I have provided the sample examples in both MS-Access and Oracle.
We'll start by creating a very simple table, a table that stores the integers from 0 to 9. Let's call it 'tblInteger'. The table creation script could look like this:
Create Table tblInteger( anInteger int)
Now fill up the table with the numbers from 0 to 9.
insert into tblInteger values (0)insert into tblInteger values (1) insert into tblInteger values (2) ...insert into tblInteger values (9)
We'll be using this table extensively in the course of this article.
You've probably guessed by now the easiest way to generate a simple sequence of 10 numbers:
Select anInteger from tblInteger
The query will output:
anInteger |
---|
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
And to generate a sequence of multiples of 10 from 0 to 90:
Select anInteger*10 as tens from tblInteger
tens |
---|
0 |
10 |
20 |
30 |
40 |
50 |
60 |
70 |
80 |
90 |
Duh? What's the big deal? A generated sequence of 10 numbers?
Agreed. Generating a sequence of 10 numbers was simple enough as all the numbers were hard coded in a table! But, when two sets (sequences) are combined using a cartesian product, an entirely new set is created.
Using a Cartesian Product
Now we will generate a much larger sequence of 100 numbers, by combining the two queries that we just saw.
Select tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10) as NumberSequence from tblInteger as tblZeroToNine , tblInteger as tblTensFromZeroToNinetyorder by tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10)
This will output a sequence of 100 numbers.
NumberSequence |
---|
0 |
1 |
2 |
3 |
... |
98 |
99 |
What exactly happened here?
We used a "cartesian product" to combine two sets of numbers --
one with a sequence of numbers from 0 to 9 and another with a staggered sequence of numbers from 0 to 90, to generate a completely new set of numbers.How does a cartesian product work? Revisiting some good old set theory, consider two very simple sets:
set of Men = { Peter, Michael }
set of Women = { Amanda, Jennifer }
The cartesian product of the set of Men and Women will first combine Peter with
Amanda and Jennifer, and then combine Michael with Amanda and Jennifer.So the new set will have:cartesian product = { (Peter Amanda), (Peter Jennifer), (Michael Amanda), (Michael Jennifer) }
Also note that in a cartesian product the sequence of elements is
important, which means "Women cartesian product Men" is not the same as "Men cartesian product Women", so (Peter Amanda) is not the same person as (Amanda Peter) -- not unless they had multiple personality disorder!In this SQL example we had two sets, one with 10 numbers from {0 to 9} (call that as set UNITS) and another set with 10 numbers in increments of 10 from {0 to 90} (set TENS). Every element in set UNITS is combined with every element in set TENS. So the 0 from set UNITS is combined with 0, 10, 20, ... 90 from set TENS, then the element 1 from set UNITS is again combined with all the elements in set TENS and so on. We combine the elements using the arithmetic "+".
Building more sophisticated sequences
With a bit of creativity we can extend this concept and build more sophisticated sequences.
To build a sequence of 10 odd numbers we can use this query:
Select (2*anInteger)+1 From tblInteger
The above query will output numbers: 1,3,5,7, ... 19. (Math note: to convert an integer value 'n' to an odd number, the formula is: 2*n + 1)
When combined with the query to build the sequence of the first 10 multiples of 10 we can create a sequence of the first 50 odd numbers:
Select distinct ( (2*tblFirst10oddNumbers.anInteger)+1 + (tblZeroToNinety.anInteger*10) ) as OddNumberSequence from tblInteger as tblFirst10oddNumbers , tblInteger as tblZeroToNinety where (2*tblFirst10oddNumbers.anInteger)+1 + (tblZeroToNinety.anInteger*10) < 100
We can also create sequences with dates. The following SQL query will return a
set containing the next hundred dates from the present date.Select Getdate() + units.anInteger + (tens.anInteger*10) as DateSequence from tblInteger as units , tblInteger as tens
In the above query, Getdate()
is a SQL Server function to return the current date; you can replace it with the "date()" function of the database of your choice.
While the world of HTML has been a fairly recent adopter of standards, the world of SQL has a long history of standards. SQL was adopted as an ISO standard back in 1987. Eventually the standard was revised in 1992, and that version was called SQL92 (now called SQL2). The SQL2 standard sets guidelines for conceptual SQL features.
Similar to the w3.org compliance levels, SQL2 also supports various levels of conformance -- entry level, intermediate and advanced. Most modern databases are typically measured against the SQL2 standard. The most recent SQL99 (or SQL3) standard focuses on object oriented databases.
If your database server supports some sophisticated syntax, you can easily build complex patterned sequences. The following query makes use of SQL Server's CASE()
syntax to generate a 1000 element long alpha-numeric sequence.
Note:
For those not familiar with SQL Server's Case()
statement, it is quite similar to the IIF()
function in Access or the decode()
function in oracle.
And before you get all worked up about the use of proprietary SQL constructs, SQL Server's Case()
syntax is SQL2 compliant.
Personal note: Now I am beginning to sound like one of those HTML compliance article writers.
Select Distinct anAlphabet + ( Case When tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10) < 10 Then '00'+ Cast(tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10) as char(1)) When (tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10) > 9 and tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10)<= 99) Then '0'+ Cast(tblZeroToNine.anInteger + (tblTensFromZeroToNinety.anInteger*10) as char(2)) Else '0000' End ) as alphaSeq from tblInteger as tblZeroToNine , tblInteger as tblTensFromZeroToNinety , tblAlphabetsorder by alphaSeq
Here, "tblAlphabets" is a table with the first ten letters of the english alphabet (from A to J), the table structure looks like this:
Create Table tblAlphabets ( anAlphabet char(1))
Endnotes
This was just an introduction. The type of usage, and the level of complexity of this technique is limited only by imagination.
I would recommend going through a good SQL manual, your favourite RDBMS's SQL manual should be a good starting point.