Sean Corfield recently posted an entry about recursionin the database. This got me thinking. I once had a teacher who was adamantly against recursion for performance reasons, saying that Towers of Hanaoi was the only example where recursion was beneficial from a performance standpoint.
I have been unable to prove him wrong; can anyone else do so?
As such, I've always been cautious of recursion.
I recently spoke to a client about an application with a similar structure (to what Sean describes in his post). It was storing part information in a self-referencing table. (One part can be made to build other parts, etc.. ). It turned out that this application was a drag, because in some cases it was recursing up to 300 times to get all the data in the 'recursive tree'. This is a situation where recursion was just not working. We spoke, and I suggested this type of solution.
For simplicity, I'll use a table like this:
PKID is the primary key. Data is whatever data we are storing (whether it is part numbers or cat pedigree) RecurseOrder is a text string
For each element in the table, you give it a 3-character identifier (In my example, I'll use 3 characters, but the identifier can actually be 4 or more). The RecurseOrder field contains 'Parent's identifier' + 'Current identifier'
So, to put some data in this table:
| 1 | Root 1 | 001 | |
| 2 | Root 2 | 002 | |
| 3 | Root 3 | 003 | |
| 4 | Child 1.1 | 001001 | |
| 5 | Child 1.2 | 001002 | |
| 6 | Child 1.3 | 001003 | |
| 7 | Child 1.4 | 001004 | |
| 8 | Child 2.1 | 002001 | |
| 9 | Child 2.2 | 002002 | |
| 10 | Child 2.3 | 002003 | |
| 11 | Child 2.3.1 | 002003001 | |
| 12 | Child 2.3.2 | 002003002 | |
| 13 | Child 2.3.2.1 | 002003002001 | |
| 14 | Child 2.3.2.2 | 002003002002 | |
| 15 | Child 2.2.1 | 002002001 | |
| 16 | Child 2.2.2 | 002002002 |
The root is assigned '001', '002', '003'. The children of root 1 are assigned '001001','001002', '001003'. The children of child 1 are assigned '001001001', '001001002', '001001003', and so on.
Given any root, you can get the tree in recursive order just by sorting on the recurseorder column. To get the recurseorder on root 2, for example, you can do this:
2where recurseorder like '002%'
3order by recurseorder
To get the tree, starting at the third child of root 2, you can do this:
2where recurseorder like '002003%'
3order by recurseorder
In some cases, such as discussion groups (where you want to retrieve a whole thread in post order) I have found this to be an elegant solution.
With a three-character identifier, you are limited to 1000 elements before you run out of thread strings. But you use a larger identifier string if you need more.

#1 by jM on 4/6/07 - 5:39 PM
#2 by Jeffry Houser on 4/7/07 - 10:58 AM
I have to admit, I don't completely grok cascading updates. I googled it, but couldn't find a definition. If you could elaborate specifics on what you're trying to do...
#3 by jM on 4/7/07 - 12:16 PM
I don't know if the concept is always called "cascading update" but I was referring to Peter Bell's comment about "if you change the relative order towards the top of the tree you have to cascading update a bunch of other values to keep everything in synch". My understanding of that comment is: if you were to insert a new child after
9 Child 2.2 002002
the recurseOrder for the new child would become 002003. This would in turn affect the existing recurseOrder of
10 Child 2.3 002003
and its children. Their recurseOrders would become
10 Child 2.3 002004
11 Child 2.3.1 002004001
..
So the change caused by inserting a new child (etc) must be cascaded through other records to keep the recurseOrders consistent.
My original question was how are you applying those changes: using triggers, a sequence of update statements, stored procedures .. ?
Hopefully that clarified things :)
#4 by Jeffry Houser on 4/7/07 - 1:30 PM
With the approach I describe in this post, cascading updates is not an issue because we are always something at the end of the list.
You never insert a new child between other children. It is always after the largest child.
This would be great for something like a threaded discussion list. You're either creating a new message or a response to one. New messages are done at the largest version of the top level, '001' or '002'. Responses are done at the largest version of the main message.
So, if '002' has 5 responses, the new response would be 002006.
If you are creating a response to 002003, it would be come 002003001
And the second reply would become 02003002
etc..
So, if you need to keep a tree of some sort that needs constant updating of structure, I'm not sure the approach I describe here is the best one.
It might be possible that you could use the recurseOrder field for the hierarchy and a separate "SortOrder" field for ordering the display of the top level element. I haven't experimented with this, though.
Definitely read through the post I linked to regarding The Celko model and Adjancy lists:
http://www.intelligententerprise.com/001020/celko....
It should give you a good idea on how to deal with these type of updates using alternate approaches.
#5 by jM on 4/7/07 - 2:27 PM
>structure, I'm not sure the approach I describe here is the best one.
I'm not sure either, but it sounded like thats what Peter is doing (if I read his response correctly) The approach you described is similar to nested sets in the sense that you must cascade structural changes. And while I've read about nested sets, I haven't used them in a real application. So I was curious about how Peter implemented the updates.
>This would be great for something like a threaded discussion list.
Yes, I'll have to keep it mind for future apps. Thanks for writing it up :)
#6 by Wesley Acheson on 2/25/13 - 9:23 AM
The wikipedia article on it is actually very good at explaining the concepts.
http://en.wikipedia.org/wiki/Nested_set_model