[NTISP] Very strange problem

David V. Brenner ( (no email) )
Wed, 19 May 1999 16:19:08 -0700

Hi, folks. This isn't an NT-specific issue, but we are an NT-based ISP and
we need all the help we can get. <g>

Not too long ago, we began the process of migrating to a new upstream
provider. We were assigned four class c address blocks. Two were
contiguous; the others weren't.

Currently, we have two POPs. The main POP has two T1s (not bonded) being
handled by two Adtran TSUs and an ACC Amazon router with dual WAN ports.
The first T1 is used exclusively for our dial-up ports at that location.
One of the contiguous class c blocks is assigned to that circuit, and that's
all the circuit does.

The other circuit is what we use for our LAN, our domain hosting servers,
mail and frame relay customers. The secondary POP's PVC is tied into this
circuit as well.

The second class c is split in half with the first half assigned to our
local resources (other than terminal servers) and the second half assigned
to the other POP. As such, we are using addresses in the 216.132.67.0/25
range, while the other POP is using addresses in the 216.132.67.128/25
range. Of course, all devices connected to either network are using
netmasks of 255.255.255.128. The secondary POP has similar equipment to the
main POP, with the exception of its router, which is a Compatible Systems
MicroRouter 1200i.
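
For reference, the split described above can be checked with a short Python sketch using the standard ipaddress module. The block and the /25 halves are taken from this post; the variable names are only illustrative.

```python
import ipaddress

# The shared block as described in the post.
block = ipaddress.ip_network("216.132.67.0/24")

# Split the /24 into its two /25 halves.
main_half, secondary_half = block.subnets(prefixlen_diff=1)

for name, net in (("main POP", main_half), ("secondary POP", secondary_half)):
    hosts = list(net.hosts())
    print(f"{name}: {net} netmask {net.netmask} "
          f"hosts {hosts[0]} - {hosts[-1]} broadcast {net.broadcast_address}")
```

Both halves should report a netmask of 255.255.255.128, which is what every device on either network is configured with.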

The third class c is split into two /27 (32 addresses) subnets and one /26
(64 addresses) subnet. The remaining /25 is unused, as is the fourth class
c. The two /27 subnets are both using Compatible Systems MicroRouter 900i
routers. One is using an Adtran DSU 5600, while the other is using a BAT
Electronics unit with the same capabilities (56K/64K). The /26 subnet is
using some kind of router/firewall combo from Nokia, which they supplied.
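
The arithmetic of that layout can be sanity-checked the same way. This is only a sketch: the post does not give the third block's address, so 192.0.2.0/24 is a placeholder, and the order of the subnets within the block is an assumption.

```python
import ipaddress

# Placeholder block: the third block's real address is not given in the post.
third = ipaddress.ip_network("192.0.2.0/24")

lower, upper = third.subnets(new_prefix=25)    # two /25 halves
low_q, high_q = lower.subnets(new_prefix=26)   # two /26 quarters of the lower half
s27_a, s27_b = low_q.subnets(new_prefix=27)    # two /27s from the first quarter

# Assumed order: two /27s, then the /26, then the unused /25.
layout = [s27_a, s27_b, high_q, upper]
for net in layout:
    print(f"{str(net):<18} netmask {str(net.netmask):<16} "
          f"{net.num_addresses} addresses")

# The pieces must not overlap and must add up to the whole block.
for i, a in enumerate(layout):
    for b in layout[i + 1:]:
        assert not a.overlaps(b)
assert sum(n.num_addresses for n in layout) == third.num_addresses
```

The /27 subnets use netmask 255.255.255.224 and the /26 uses 255.255.255.192, so devices on those circuits need those masks rather than the 255.255.255.128 used on the shared block.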

It is important to note that, prior to this switch, the secondary POP had
been sharing a class c space with the company that is now set up as a /26.
At the time, both entities were /25 subnets and everything worked fine.

The problem is that, with the exception of the main POP, certain sites are
inaccessible to all of these other folks. For instance, neither www.espn.com
nor www.hotmail.com works for the dial-up customers -or- those connected via
LANs to any of the other circuits, although they load fine for anyone on our
main POP or any of the machines on our LAN (whose address space is in the
same class c as the secondary POP, only subnetted using 255.255.255.128).
Even more bizarre, if we dial into the other facility (which is across the
river in another state) using one of our test machines, it can indeed bring
up both www.hotmail.com and www.espn.com. To further add to the confusion,
an FTP server connected to the secondary POP is exhibiting weird behavior in
that *anyone* can download from it, but when someone from the main POP tries
to upload to it, the process stalls at about 25%.

Now, more specifically, with regard to the hotmail problem, understand that
the hotmail servers (there are three, I believe, with IP addresses handed out
round-robin style by Microsoft's DNS) *can* be contacted, and some
"conversing" does take place. However, the actual return trip is what
doesn't take place. In fact, running Lynx (the text-mode Unix browser) on a
Linux box at the secondary POP shows clearly that at least a small amount of
data is passing back and forth before the process freezes up with one of the
following two messages:

"HTTP request sent; waiting for response"
"HTTP/1.1 200 OK"

So, in one instance, the data never comes. In another, only the very first
bit of the header of the page shows up. In either case, this all happens
very quickly -- and then dies.
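
One way to measure the same symptom without a browser is a small Python sketch like the one below. It sends a bare HTTP request and reports how many bytes arrive before the connection goes quiet. The function names and the 10-second stall timeout are assumptions for illustration, not anything from our setup.

```python
import socket

def read_until_stall(sock, stall_timeout=10.0, chunk=4096):
    """Read from a connected socket until it closes or goes quiet.

    Returns (bytes_received, stalled). stalled is True when no data
    arrived for stall_timeout seconds before the peer closed.
    """
    sock.settimeout(stall_timeout)
    total = 0
    while True:
        try:
            data = sock.recv(chunk)
        except socket.timeout:
            return total, True
        if not data:            # peer closed the connection normally
            return total, False
        total += len(data)

def probe_http(host, port=80, stall_timeout=10.0):
    """Send a bare HTTP/1.0 GET and report how far the response gets."""
    with socket.create_connection((host, port), timeout=stall_timeout) as sock:
        request = f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        return read_until_stall(sock, stall_timeout)
```

Calling probe_http("www.hotmail.com") from a machine on each circuit and comparing the byte counts would show whether the response always dies at about the same point.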

Also, here is what I get if I redirect the entire session to a text file:

ESC[1;24rESC(BESC)0ESC[m^OESC[1;24rESC[HESC[JESC[22;1HESC[7mGetting http://www.hotmail.com/^M
Looking up www.hotmail.com.ESC[m^OESC[K^M
ESC[7mMaking HTTP connection to www.hotmail.com.^M
Sending HTTP request.ESC[m^OESC[K^M
ESC[7mHTTP request sent; waiting for response.^M
Read 428 bytes of data.ESC[m^OESC[K^M
ESC[7mHTTP/1.1 302 FoundESC[m^OESC[K^M
ESC[7mUsing http://wy1lg.hotmail.com/cgi-bin/login^M
Getting http://wy1lg.hotmail.com/cgi-bin/login^M
Looking up wy1lg.hotmail.com.ESC[m^OESC[K^M
ESC[7mMaking HTTP connection to wy1lg.hotmail.com.^M
Sending HTTP request.ESC[m^OESC[K^M
ESC[7mHTTP request sent; waiting for response.
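
The capture is easier to read with the terminal control codes stripped out. The sketch below assumes the capture has been pasted as plain text, with the escape character written as the literal letters ESC and carriage returns written as ^M, as above. The function name is only illustrative.

```python
import re

# Matches the literal markers as they appear in the pasted capture:
# "ESC[" + parameters + a final letter, "ESC(" or "ESC)" + one character,
# and "^O".
CONTROL = re.compile(r"ESC\[[0-9;]*[A-Za-z]|ESC[()][0-9A-Za-z]|\^O")

def status_lines(capture):
    """Return the readable status messages from a pasted lynx capture."""
    text = CONTROL.sub("", capture)
    return [part.strip() for part in text.split("^M") if part.strip()]

# First part of the session above, rejoined into one string.
sample = ("ESC[1;24rESC(BESC)0ESC[m^OESC[1;24rESC[HESC[JESC[22;1HESC[7m"
          "Getting http://www.hotmail.com/^MLooking up www.hotmail.com."
          "ESC[m^OESC[K^MESC[7mMaking HTTP connection to www.hotmail.com.^M"
          "Sending HTTP request.ESC[m^OESC[K^MESC[7mHTTP request sent; "
          "waiting for response.^MRead 428 bytes of data.ESC[m^OESC[K^M"
          "ESC[7mHTTP/1.1 302 FoundESC[m^OESC[K^M")

for line in status_lines(sample):
    print(line)
```

Run on the sample, this prints one status message per line, ending with the "HTTP/1.1 302 Found" redirect that precedes the second, stalled request.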

I should mention that, prior to our upstream provider switch, we had been
using an ACC Nile, which is pretty much the little brother to the Amazon.
At that time, the frame relay customers were given address blocks that we
would request from our old upstream provider. The only exception was the
space used by the secondary POP and the folks with the Nokia
router/firewall. Those two shared a whole class c in the same way that the
main and secondary POPs share a class c now. I would think that we would
be seeing much more widespread problems if we had misconfigured something.
After all, it's just a handful of sites that don't work.

We have checked with both Compatible Systems and ACC. Both insist that the
subnetting scheme is okay and that we should check our netmasks on every
device on our networks. Well, we've done that and cannot find a hole in our
setup anywhere.
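
For anyone who wants to repeat the netmask check mechanically, a sketch along these lines would do it. The two expected subnets are the ones named in this post; the device inventory is entirely hypothetical and is there only to show the output.

```python
import ipaddress

# Subnets named in the post; both use netmask 255.255.255.128.
EXPECTED = [ipaddress.ip_network("216.132.67.0/25"),
            ipaddress.ip_network("216.132.67.128/25")]

def audit(devices):
    """Yield (name, problem) for each device whose settings look wrong.

    devices is an iterable of (name, address, netmask) strings.
    """
    for name, address, netmask in devices:
        try:
            iface = ipaddress.ip_interface(f"{address}/{netmask}")
        except ValueError as exc:
            yield name, f"unparseable address or netmask: {exc}"
            continue
        if iface.network not in EXPECTED:
            yield name, f"{iface.network} is not one of the expected subnets"

# Hypothetical inventory, for illustration only.
inventory = [
    ("mail", "216.132.67.10", "255.255.255.128"),
    ("ftp", "216.132.67.140", "255.255.255.0"),    # wrong mask
    ("test", "216.132.68.5", "255.255.255.128"),   # address outside the block
]

for name, problem in audit(inventory):
    print(f"{name}: {problem}")
```

A device with the right address but a leftover 255.255.255.0 mask is flagged because its computed network is the whole /24 rather than one of the /25 halves.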

Does anyone have any other ideas? Frankly, we're stumped.

Thanks in advance.

___________________________________
David V. Brenner - dvb@cport.com
International Services Network Corporation
http://www.cport.com