Add a character, or expand to non-printable?

  • I encourage you to study Vadim Tropashko's ideas in this area. Although I ran into some limitations for my application that rendered the solution unsatisfactory for my problem, it may fit yours. Plus, it's a very interesting way of looking at the problem: http://arxiv.org/ftp/cs/papers/0402/0402051.pdf

    Wow, that was a technical article and I liked it. That's a way cool way to encode and store a tree. I had to read it several times over the weekend to understand what it was doing, then a few more times to figure out how it was doing it. It looks like a very nice, compact way to encode a complete and unique path. The one part I haven't figured out yet (a few more readings to go, or maybe this is the limitation you ran into as well) is if there is a way to figure out part of the path from the values without decoding the whole path. The most common query for my data is going to be connecting parents and children to a specific depth (e.g. all children, grandchildren and great-grandchildren for a particular record, but nothing lower; all parents, grandparents, etc, for a particular record all the way to the top of the tree). I haven't figured out how to run that type of query using this methodology without unpacking all the trees first.

  • Hi Chad;

    I ran into a couple of things with the matrix encoding:

    1) The potential sizes of the trees was limited by the encoding of the matrix cells as integers. Even going with a bigint datatype would not have predictably allowed enough room to represent the large trees I needed to. Depending on the depth, though, it should be able to handle trees with nodes numbering in the low thousands.

    2) This was more an annoyance at my own inability to solve the problem, but it looks like maybe what you're running into, too. That is the inability to derive the depth of the node by only inspecting the matrix encoding values. In my experiments, I needed to calculate and store the depth "manually". That's not to say there's not a way to do it -- I just couldn't figure it out if there is. The answer may lie in the techniques used for pruning and grafting using this model. I know Tropashko has other papers available, but here's a quick link to start: http://www.dbazine.com/oracle/or-articles/tropashko5

    HTH,

    TroyK

  • One problem I see with all the different types of continued fractions encodings is relocating subtrees.

    Both Vadim Tropashko and Dan Hazel provide ways to recalculate the new encodings of moved nodes.

    But since continued fractions directly correspond to Materialized Path, I assume it is obligatory to also close any gaps left by the relocation. That is, any next siblings (and their subtrees) of the root of the moved subtree need to be recalculated as well.

    (If we move the subtree at 3.12.5.1.21 to some other place then 3.12.5.1.22 must be moved to the place formerly occupied by 3.12.5.1.21.)

    I expect that performance and concurrency will suffer depending on the number of nodes affected by the relocation.

    Has any of you considered the Transitive Closure approach? It seems to minimize updates at the expense of (a lot of) additional storage.

  • That could present a problem. In my case, the trees are more deep than bushy (typically), but a change near the top would result in a lot of change downstream. Fortunately, changes near the top are rare due to the nature of the tree content itself. I chose to leave the holes from a move or a removal, and the creation algorithm has new members fill them in later. Since the node values don't have any intrensic meaning, I can also rebuild (defrag) the tree at any time without consequence. Of course, this is specific to my implementation and only works because of the business rules I have in place, so I'll call myself lucky. YMMV.

    Chad

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply