[ns] Possible bug in ns multicast



Hi all.

I think I have found a bug, or at least something very strange, in the
code for the multicast DM routing protocol. I used the 2.1b6 release on
Intel Linux SuSE 6.0 (kernel 2.0.36). Under certain conditions, the bug
prevents a graft from being propagated from a node to its upstream node
in the multicast tree. I didn't find any answer in the mailing list
archives.

Consider the following scenario. You have a node N with two outgoing
links. The first link leads to a multicast receiver and the other to a
TCP sink, which prunes node N each time it receives multicast packets.

                    Multicast receiver
                   /
                  /
 Upstream - - - N
   node           \
                   \
                    TCP receiver

Suppose the multicast receiver decides to leave the group. It prunes
node N and the prune propagates up the multicast tree. Soon after, the
multicast receiver joins the group again and sends a graft. While the
graft is on its way to node N, the prune timeout for the link leading to
the TCP receiver expires, so multicast packets will be forwarded on that
link until node N is pruned again by the TCP receiver. When the graft
arrives at node N just afterwards, it is not propagated upstream, even
though node N is no longer part of the multicast tree at that moment
because of the earlier prune. As a result, the multicast receiver
receives nothing until the prune timeout for its own link expires.

Here is what I suspect. The condition for the graft to be propagated is
that node N is not the multicast source (it is not the source here) and
that there are no active replicators at node N for that group and source
(see "DM instproc recv-graft" in tcl/mcast/DM.tcl). Before the prune
timeout for the TCP receiver, this condition holds, but the effect of
the timeout is to insert the corresponding outgoing interface in the
list of replicators at the node (see "DM instproc timeoutPrune" in
tcl/mcast/DM.tcl). If I understand correctly what happens next, the
condition for the graft propagation becomes false (see, among others,
the instprocs init, insert, enable and is-active of class
"Classifier/Replicator/Demuxer" in tcl/mcast/ns-mcast.tcl: nactive_
becomes > 0). My conclusion is that the condition used for graft
propagation in the code is not consistent with the action taken during a
prune timeout.
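
To make the suspected sequence concrete, here is a small toy model in
Python. It is not the ns code: the class ToyNode, its methods and the
function run_scenario are names made up for illustration, and the graft
condition is only my reading of recv-graft and timeoutPrune as described
above.

```python
# Toy model of the suspected logic. This is NOT the ns Tcl code;
# ToyNode, its methods and run_scenario are hypothetical names.

class ToyNode:
    def __init__(self, oifs):
        # Outgoing interfaces currently active in the replicator.
        self.active = set(oifs)
        # True while this node has pruned itself from its upstream node.
        self.pruned_upstream = False
        # Number of grafts this node has forwarded upstream.
        self.grafts_sent_upstream = 0

    def recv_prune(self, oif):
        # A downstream neighbour prunes one outgoing interface.
        self.active.discard(oif)
        if not self.active and not self.pruned_upstream:
            # No interface left: the prune propagates upstream.
            self.pruned_upstream = True

    def timeout_prune(self, oif):
        # Assumed effect of the prune timeout: the interface is simply
        # re-inserted, with no check of the upstream state.
        self.active.add(oif)

    def recv_graft(self, oif, is_source=False):
        # Assumed condition: forward the graft only if this node is not
        # the source and has no active replicator.
        if not is_source and not self.active:
            self.grafts_sent_upstream += 1
            self.pruned_upstream = False
        self.active.add(oif)


def run_scenario(timeout_before_graft):
    n = ToyNode(["mcast", "tcp"])
    n.recv_prune("tcp")        # TCP sink prunes node N
    n.recv_prune("mcast")      # multicast receiver leaves the group
    if timeout_before_graft:
        n.timeout_prune("tcp") # TCP prune timeout fires first
    n.recv_graft("mcast")      # graft from the multicast receiver arrives
    return n.grafts_sent_upstream, n.pruned_upstream


if __name__ == "__main__":
    print("timeout before graft:", run_scenario(True))
    print("graft without timeout:", run_scenario(False))
```

In this model, run_scenario(True) leaves node N pruned upstream with no
graft forwarded, which is the behaviour I observe, while
run_scenario(False) forwards the graft as expected.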

I hope my explanation is correct. In any case, there is a problem with
graft propagation in this special case, so I'd like to hear other
opinions about this strange behaviour, or a confirmation of the problem.
If needed, I can send my simulation scripts and traces, but I think a
simple look at the code is sufficient to point out the problem.

Nicolas BONMARIAGE.

Student - University of Liège - Belgium.