Re: NEWS server

Jeff Woods ( jeff@delta.com )
Wed, 19 Mar 1997 15:38:07 -0500

At 11:44 AM 3/19/97 -0800, you wrote:

>We are testing a full feed to INS, and it's really starting to
>annoy me. Every time that machine has either crashed or rebooted,
>I must run a rebuild of the hash tables before the news server will
>start.

This happens on any installation. The problem is the "bubble-type"
comparison that the hash table rebuild does, comparing each record to every
other record to ensure there are no duplicates. As the number of files on the
drives increases, the rebuild time grows roughly with the square of that number.
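
To put rough numbers on that all-pairs check, here is a minimal Python sketch. It assumes the rebuild really does compare every record against every other one, as described above; I don't know what INS or the other servers actually do internally, and the function names are made up for illustration. The second function shows the usual alternative, a single pass with a hash set.

```python
def pairwise_comparisons(n):
    """Comparisons needed to check each of n records against every other once."""
    return n * (n - 1) // 2


def find_duplicates_pairwise(ids):
    """All-pairs duplicate check: work grows with the square of len(ids)."""
    dups = set()
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if ids[i] == ids[j]:
                dups.add(ids[i])
    return dups


def find_duplicates_hashed(ids):
    """Single pass with a set: work grows roughly linearly with len(ids)."""
    seen, dups = set(), set()
    for x in ids:
        if x in seen:
            dups.add(x)
        else:
            seen.add(x)
    return dups


if __name__ == "__main__":
    # Hypothetical spool sizes, chosen only to show the growth.
    for n in (1_000, 1_000_000, 2_000_000):
        print(n, "records ->", pairwise_comparisons(n), "comparisons")
```

Doubling the number of records roughly quadruples the all-pairs work, which is why a spool that is merely annoying to rebuild at one size becomes hopeless at a few times that size.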

I've seen this on INS, Netscape, DNews, and Netmanage IFS. It seems it cannot
be avoided when running news on NT. It happens to us about every
other month or so. Our solution has been to RENAME AUTOCHK.EXE so that it
doesn't auto-rebuild, leaving the cross-linked files present. EVENTUALLY
(a couple of months after that) this will force you to respool (delete
the spool partition and reformat it), but that's better than leaving your
news servers down for two days while it rebuilds a hash table of two
million entries.

In fact, we're right in the middle of doing that today -- it finally got to
the point that it wouldn't run, so we had to delete and respool. Since we
still have the ACTIVE file, no users have lost news -- they just can't
access it for the time it takes to delete the spool partition and reformat
it (and with RAID, 30 gigs formats in about 2 minutes), and they can't read
any articles that have already arrived.

Advice for this: Keep the SPOOL partition separate from EVERYTHING else.
That way you don't have to back anything up -- just delete the history
files, respool, and create a new history file. Voila!

>The real complaint is that with a two-gig spool (it's expiring at
>48 hours), it takes 8 hours to rebuild the hash.

Lucky you -- we expire at 10 days, with over 3 million articles spooled at
any time, and it would take DAYS to rehash -- thus we just respool as needed.

Note that multiple news servers in a round robin MIGHT help this, but only
if they were synchronized so that each article is identified by the same
local file name (article number) on every server. Otherwise, if a user got
one server today and a different one tomorrow, his read pointers from the
first server would not match the article numbers on the second. <sigh>
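
A small Python sketch of why the numbering has to match. The NewsServer class and the Message-IDs are hypothetical, and the only assumption is that each server hands out local article numbers in the order articles arrive, while the newsreader remembers just the highest number it has read.

```python
class NewsServer:
    """Toy server that numbers articles in local arrival order (an assumption)."""

    def __init__(self, name):
        self.name = name
        self.next_number = 1
        self.by_number = {}  # local article number -> Message-ID

    def store(self, message_id):
        number = self.next_number
        self.by_number[number] = message_id
        self.next_number += 1
        return number


def unread_after(server, last_read):
    """Message-IDs the reader would fetch given its 'last read' article number."""
    return [mid for num, mid in sorted(server.by_number.items()) if num > last_read]


if __name__ == "__main__":
    a, b = NewsServer("a"), NewsServer("b")
    # Same three articles, but the feeds deliver them in a different order.
    for mid in ("<1@x>", "<2@x>", "<3@x>"):
        a.store(mid)
    for mid in ("<2@x>", "<1@x>", "<3@x>"):
        b.store(mid)

    last_read = 1  # the user read article 1 on server a, i.e. <1@x>
    print("next on a:", unread_after(a, last_read))
    print("next on b:", unread_after(b, last_read))
```

On server b the same pointer skips an article the user never saw and serves one he already read, which is the disagreement described above.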

>Now THAT'S
>annoying! We have DNEWs running for almost a year, and although it's
>sometimes slow, it at least doesn't die and/or can recover instantly.

FWIW, Netmanage doesn't die often either. But every now and then (every
couple months) you get a dead drive, or a prolonged power outage, or
SOMETHING that will lock it up or send you to a BSoD. And it'll cause
data errors.