Posted August 27th 2009
GTH E1/T1 modules are always controlled by a general-purpose server, usually some sort of unix machine. The server and GTH are connected by ethernet and communicate using TCP sockets. Normally, that ethernet connection is chosen to be simple and reliable, for instance by putting the server and the GTH in the same rack, connected to the same ethernet switch.
I experimented a bit to see what happens when that network gets interrupted. I interrupted the network in a reproducible way by disabling the server's ethernet port for a known length of time and then re-enabling it, all while running a <recorder>. (A <recorder> sends all the data, typically someone talking, from an E1 timeslot to the server over a TCP socket, 8000 octets per second.)
Here's what I did to capture traffic and interrupt the ethernet:
tcpdump -w /tmp/capture.pcap -s 0 not port 22
sudo ifconfig eth0 down; sleep 5; sudo ifconfig eth0 up
The GTH buffers about two seconds of timeslot traffic, so a 'sleep' of about a second won't result in an overrun. Here's what a one-second interruption looks like in wireshark:
Packet | Time | Direction | Flags | Seq. # |
133 | 7.596 | GTH -> server | [PSH, ACK] | 59393 |
134 | 7.633 | server -> GTH | [ACK] | 1 |
135 | 7.724 | GTH -> server | [PSH, ACK] | 60417 |
136 | 7.761 | server -> GTH | [ACK] | 1 |
137 | 7.852 | GTH -> server | [PSH, ACK] | 61441 |
138 | 7.889 | server -> GTH | [ACK] | 1 |
139 | 7.980 | GTH -> server | [PSH, ACK] | 62465 |
140 | 8.017 | server -> GTH | [ACK] | 1 |
141 | 8.108 | GTH -> server | [PSH, ACK] | 63489 |
142 | 8.145 | server -> GTH | [ACK] | 1 |
143 | 8.236 | GTH -> server | [PSH, ACK] | 64513 |
144 | 8.273 | server -> GTH | [ACK] | 1 |
145 | 8.364 | GTH -> server | [PSH, ACK] | 65537 |
146 | 8.401 | server -> GTH | [ACK] | 1 |
147 | 10.151 | GTH -> server | [PSH, ACK] | 66561 |
148 | 10.151 | server -> GTH | [ACK] | 1 |
149 | 10.151 | GTH -> server | [ACK] | 67585 |
150 | 10.151 | server -> GTH | [ACK] | 1 |
Everything up to packet 146 is normal: the GTH (172.16.2.5) sends 8000 octets every second and the server (172.16.2.1) acks them. It happens to be in chunks of 1024 octets about eight times per second. After packet 146, about 8.4 seconds after the capture started, the ethernet interface went down and stayed down for 1s. The TCP stream started up again after about 1.5s and then 'caught up' by sending many packets in quick succession.
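The length of the stall can be read off the trace by looking for the largest gap between consecutive GTH data packets. The Python sketch below does that for the packets listed above. The 8000 octets per second rate and the roughly two-second buffer come from the text; the function name `largest_gap` and the hard-coded packet list are only there for illustration.

```python
# (time in seconds, TCP sequence number) for the GTH -> server packets
# in the trace above.
GTH_PACKETS = [
    (7.596, 59393), (7.724, 60417), (7.852, 61441), (7.980, 62465),
    (8.108, 63489), (8.236, 64513), (8.364, 65537), (10.151, 66561),
    (10.151, 67585),
]

OCTETS_PER_SECOND = 8000   # one E1 timeslot
BUFFER_SECONDS = 2.0       # approximate GTH buffering, taken from the text


def largest_gap(packets):
    """Return (gap_seconds, time_of_last_packet_before_the_gap)."""
    best = (0.0, packets[0][0])
    for (t0, _), (t1, _) in zip(packets, packets[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, t0)
    return best


gap, start = largest_gap(GTH_PACKETS)
backlog = gap * OCTETS_PER_SECOND   # octets the GTH had to hold back
print(f"stall of {gap:.3f}s starting at {start:.3f}s, "
      f"about {backlog:.0f} octets queued")
print("overrun expected" if gap > BUFFER_SECONDS else "within buffer")
```

Measured this way, from the last data packet before the interruption to the first one after it, the stall is just under 1.8 seconds, which still fits inside the two-second buffer.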
I took a second trace similar to the first one, except this time, I disabled ethernet for about five seconds:
Packet  Time      Source IP     Dest IP       SPort   DPort
----------------------------------------------------------------------
28  1.040083  172.16.2.5 -> 172.16.2.1  54271 > 45195  [PSH, ACK] Seq=7169
29  1.040095  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
30  1.168065  172.16.2.5 -> 172.16.2.1  54271 > 45195  [PSH, ACK] Seq=8193
31  1.168078  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
32  1.296067  172.16.2.5 -> 172.16.2.1  54271 > 45195  [PSH, ACK] Seq=9217
33  1.296079  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
34  1.424068  172.16.2.5 -> 172.16.2.1  54271 > 45195  [PSH, ACK] Seq=10241
35  1.424081  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
36  7.782851  172.16.2.5 -> 172.16.2.1  54271 > 45195  [PSH, ACK] Seq=11265
37  7.782863  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
38  7.783406  172.16.2.5 -> 172.16.2.1  54271 > 45195  [ACK] Seq=12289
39  7.783413  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
40  7.783569  172.16.2.5 -> 172.16.2.1  54271 > 45195  [ACK] Seq=13737
...
50  7.784962  172.16.2.5 -> 172.16.2.1  54271 > 45195  [FIN, PSH, ACK] Seq=23873
51  7.784972  172.16.2.1 -> 172.16.2.5  45195 > 54271  [ACK] Seq=1
52  7.785026  172.16.2.1 -> 172.16.2.5  45195 > 54271  [FIN, ACK] Seq=1
53  7.785348  172.16.2.5 -> 172.16.2.1  54271 > 45195  [ACK] Seq=25322
Everything is normal up to packet 35. Then, ethernet is suspended for five seconds and TCP takes a further second to recover, which causes a buffer overrun on the GTH (172.16.2.5). The GTH closes the socket at packet 50 and also sends an overrun event to the application so that it knows why the socket was closed.
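On the server side, the FIN at packet 50 shows up as an ordinary end-of-stream on the TCP socket. The Python sketch below shows one way a receiver might notice that: it reads until `recv` returns an empty result. This is generic socket code, not the GTH API. The function name `drain_until_closed` is made up for this example, a local socket pair stands in for the GTH, and the overrun event that explains the close is not modelled here.

```python
import socket


def drain_until_closed(sock, chunk_size=1024):
    """Read from sock until the peer closes it; return the octets received."""
    received = bytearray()
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:          # empty read: the peer has sent FIN
            break
        received.extend(chunk)
    return bytes(received)


if __name__ == "__main__":
    # Stand-in for the GTH: a local socket pair instead of a real module.
    sender, receiver = socket.socketpair()
    sender.sendall(b"\x55" * 2048)   # two 1024-octet chunks of dummy audio
    sender.close()                   # the GTH closes its end after an overrun
    data = drain_until_closed(receiver)
    receiver.close()
    print(f"peer closed after {len(data)} octets")
```

An application that sees the stream end this way still keeps everything it received before the close, so it can decide for itself whether to start a new recording.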
GTH uses IP for control and traffic. It is important that the IP link between the GTH and the server is simple and reliable. Ideally the GTH and server should be in the same rack and be connected by an ethernet switch.
It's possible for a system to survive a short interruption (less than a second) to the ethernet traffic without pre-recorded calls getting interrupted. For longer interruptions, all bets are off.
(Interruptions aren't the only type of network problem, e.g. radio networks such as 802.11 can suffer significant packet loss, which can trigger TCP congestion avoidance. But that's another topic.)
Tags: GTH, questions-from-customers