bug report: VegasTcpAgent
Problem:
Under certain circumstances, TCP Vegas may set its cwnd_ (congestion
window) to 0 and never transmit any packets again, even though there are
packets waiting to be sent and the network is idle.
Problem source:
tcp-vegas.cc, line 137 "cwnd_ = v_newcwnd_". It's meant to restore cwnd_
to its value before it was inflated due to duplicate ACKs. Obviously
VegasTcpAgent should check if cwnd_ has indeed been inflated before doing
this. Problem is, the check (line 136) "if(dupacks_ > NUMDUPACKS &&
cwnd_ > v_newcwnd_)" is *not enough*.
Here's a case where things go wrong:
1. Duplicate ACKs are received, and "v_worried_" is set to 2 (line 305).
2. Before any new ACK arrives, a timeout occurs, and therefore
"v_newcwnd_" is set to 0 (line 379).
3. A new ACK arrives, and it's determined that an outstanding packet has
"expired" (line 281; note that v_worried_ > 0) and therefore dupacks_ =
NUMDUPACKS (= 3).
4. A duplicate ACK arrives, so ++dupacks_ (line 290). "v_newcwnd_ =
double(win)" is not executed since we are in "CWND_ACTION_TIMEOUT"
state and hence v_newcwnd_ stays 0.
5. A new ACK arrives and cwnd_ is set to v_newcwnd_ (line 137), which is
0. The TCP sender drops dead.
I got this case using an error model on a link.
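To make the sequence easier to follow, here is a minimal sketch that replays steps 1-5 in a toy model. It is not the ns-2 source: only the field names and NUMDUPACKS come from tcp-vegas.cc as described above. The struct, its method names, the starting window of 8, the post-timeout window of 1, and the resets of dupacks_ are all assumptions made for illustration.

```cpp
#include <cassert>

constexpr int NUMDUPACKS = 3;

// Toy model of the five-step sequence; NOT the ns-2 source.
// Field names follow the report; everything else is hypothetical.
struct ToyVegas {
    double cwnd_ = 8.0;        // assumed starting window
    double v_newcwnd_ = 8.0;   // assumed saved pre-inflation window
    int dupacks_ = 0;
    int v_worried_ = 0;
    bool in_timeout_ = false;  // stands in for the CWND_ACTION_TIMEOUT state

    // Step 1: the third duplicate ACK arrives.
    void third_dupack() {
        dupacks_ = NUMDUPACKS;
        v_worried_ = 2;
    }

    // Step 2: retransmission timeout.
    void timeout() {
        cwnd_ = 1.0;           // assumed post-timeout window
        v_newcwnd_ = 0.0;      // as in the report (line 379)
        dupacks_ = 0;          // assumed reset
        in_timeout_ = true;
    }

    // Steps 3 and 5: a new ACK. `expired` says whether an outstanding
    // packet is judged to have timed out.
    void new_ack(bool expired) {
        if (dupacks_ > NUMDUPACKS && cwnd_ > v_newcwnd_) {
            cwnd_ = v_newcwnd_;            // the "restore" on line 137
        }
        dupacks_ = 0;                      // assumed reset on a new ACK
        if (v_worried_ > 0) {
            --v_worried_;
            if (expired) dupacks_ = NUMDUPACKS;
            else v_worried_ = 0;
        }
    }

    // Step 4: a duplicate ACK.
    void dup_ack() {
        ++dupacks_;
        // Simplified stand-in for "v_newcwnd_ = double(win)";
        // skipped in the timeout state.
        if (!in_timeout_) v_newcwnd_ = cwnd_;
    }
};

// Replays steps 1-5 and returns the final cwnd_.
double replay_scenario() {
    ToyVegas s;
    s.third_dupack();             // step 1
    s.timeout();                  // step 2
    s.new_ack(true);              // step 3: a packet has "expired"
    s.dup_ack();                  // step 4: dupacks_ becomes 4
    assert(s.v_newcwnd_ == 0.0);  // the saved window was not refreshed
    s.new_ack(false);             // step 5: cwnd_ = v_newcwnd_
    return s.cwnd_;
}
```

Under these assumptions, calling replay_scenario() ends with cwnd_ equal to v_newcwnd_, which is 0.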
Potential solutions:
1. When timeout occurs, set v_worried_ to 0. It seems to make no sense to
"keep worried" in common cases, since usually all outstanding packets
will be retransmitted anyway after a timeout.
However, the problem may still occur if more than 3 duplicate ACKs
arrive right after a timeout: v_newcwnd_ stays 0 while dupacks_
exceeds 3.
2. Inflate cwnd_ even when the test on lines 299-300 fails. This
closes a loophole in trying to guarantee that if dupacks_ > 3, cwnd_
must have been inflated. I am not sure if there are other loopholes.
In fact, RenoTcpAgent does this in similar cases (tcp-reno.cc, line 88,
whether dupack_action() decides to "slow down" or not). I've tried to
fix it this way but haven't come up with "clean" code that I'm happy
with.
3. Another curious problem is the test "if(dupacks_ > NUMDUPACKS && cwnd_
> v_newcwnd_)" itself (line 136). I think it should be "if(dupacks_ >=
NUMDUPACKS && cwnd_ > v_newcwnd_)". In "normal" cases (i.e., the
problems I mentioned above *not included*), dupacks_ == 3 means cwnd_
has been inflated, and I don't see why it shouldn't be restored to its
pre-inflation value in this particular case.
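The difference between the two forms of the line-136 test in item 3 can be isolated in a few lines. This is only a sketch: should_restore() and its `inclusive` parameter are hypothetical names that do not exist in ns-2, and the window values in the comments are made-up inputs.

```cpp
// Hypothetical helper isolating the line-136 test; not ns-2 code.
constexpr int NUMDUPACKS = 3;

// inclusive == false models the current ">" test,
// inclusive == true models the proposed ">=" test.
bool should_restore(int dupacks, double cwnd, double v_newcwnd, bool inclusive) {
    bool enough = inclusive ? (dupacks >= NUMDUPACKS) : (dupacks > NUMDUPACKS);
    return enough && cwnd > v_newcwnd;
}

// With exactly 3 duplicate ACKs and an inflated window (say cwnd 5, saved 4):
//   should_restore(3, 5.0, 4.0, false) is false: cwnd_ is left inflated
//   should_restore(3, 5.0, 4.0, true)  is true:  cwnd_ is deflated
```

Neither form checks whether v_newcwnd_ holds a meaningful value, so with dupacks_ == 4, cwnd_ == 1 and v_newcwnd_ == 0 both return true and cwnd_ would still be set to 0. Item 3 is therefore a separate issue from the "drop dead" case.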
A quick look at Brakmo's TCP Vegas implementation (on x-kernel;
ftp://ftp.cs.arizona.edu/xkernel/new-protocols/Vegas.tar.Z) shows that it
has the same problems of lingering "v_worried_" (1. above) and not
"deflating" cwnd_ in some cases (3. above), so I'm not sure if they are
bugs or features of TCP Vegas. The x-kernel implementation may not have
the "drop dead" syndrome because it doesn't set v_newcwnd_ to 0 when a
timeout occurs. However, it may still set a "wrong" cwnd_ value due to
the "v_worried_" problem.
In any case, I think we can't accept the "drop dead" phenomenon as a
feature of VegasTcpAgent. If anyone knows of a newer TCP Vegas
implementation (than the x-kernel version mentioned above) that has
addressed these three problems, please let us (me and other interested
people) know. Thanks a lot!
========================================
Kuang-Yeh Wang [email protected]
University of Maryland at College Park
Department of Computer Science
========================================