Recursion isn't the problem. Not keeping track of seen tweets is the problem. Re...

eyelidlessness · on Nov 30, 2020

But you wouldn’t do that if your assumption is that the graph doesn’t cycle.

Edit: child comments are correct and I regret my oversight

ulber · on Nov 30, 2020

For traversing a DAG you probably still would to avoid exploring an exponential number of paths (consider a chain of diamonds [1]).

1: https://www.researchgate.net/figure/A-diamond-shaped-DAG_fig...

tyrust · on Nov 30, 2020

But a diamond cannot occur in a Twitter-reply graph, right? It would require a Tweet to be able to reply to more than one tweet.

anonytrary · on Nov 30, 2020

It can because the assumption is that we are crawling embedded links as well as native parents.

delecti · on Dec 1, 2020

You can reply to one tweet while quote-retweeting another. Depending on how links are defined that could result in a diamond.

zemnmez · on Nov 30, 2020

It’s a lot easier to just have a depth limit on such non-cyclic graphs than keep an in-memory list of previously seen nodes. its interesting for sure! but a much rarer edge-case imo

anonytrary · on Nov 30, 2020

Graph doesn't need to cycle to hit a node twice. Good code does not visit the same node twice.

patates · on Dec 1, 2020

When doing recursion (Postgres recursive CTE) I keep a path on the latest edges and check to see new edges aren't already visited, so same nodes can appear in multiple branches but not on the same branch. Works flawlessly.

pfarrell · on Dec 1, 2020

Can you provide an example of this? I’d like to understand this technique.

Edit: is this an example? https://stackoverflow.com/a/1757915

dheera · on Nov 30, 2020

Irritation would probably better in terms of not blowing up your memory requirements. Just keep hashes of all visited nodes and a stack or queue of to-be-visited nodes and loop until you have no more to-be-visited nodes.

waterproof · on Dec 1, 2020

I think you meant "iteration", not "irritation".