Monday, August 3, 2009

station goes offline, and is revived

On July 20th, 2009, I received an email from Rex Hervey (NDBC) about issues relating to data feeds (current and planned) from our St. Croix and Little Cayman CREWS stations.  As an afterthought Rex said:  "Also, we haven't received data from the Port Everglades station for a while now."

He wasn't the only one who'd noticed.  Tom Carsey on July 21st, 2009, sent his own trouble report:  "Evidently the Pt. Everglades site has not transmitted since July 12."

Following up on Rex's report (I was in Little Cayman helping install the new CREWS station at the time, and hadn't seen Tom's message yet), I commented in email:
Rex at NDBC mentioned that our Port Everglades data feed went offline.  Checking the archives, it seems like it's been offline for about nine days.  It's difficult for me to investigate further from the field, but I thought I'd mention in case others wanted to look into it further.

The last transmission (that I see) was at 13:42 UTC on day 193 (Sunday morning, the 12th??).
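The "day 193" in that message is a day-of-year (ordinal) date, and the "??" can be settled with a quick calculation. The following is only a sketch of that check; the function name `day_of_year_to_date` is made up for illustration and is not part of any station software.

```python
from datetime import date, timedelta

def day_of_year_to_date(year: int, day_of_year: int) -> date:
    """Convert a 1-based day-of-year (as in the archive listing) to a calendar date."""
    # Day 1 is January 1st, so we add (day_of_year - 1) days to it.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

last_tx_day = day_of_year_to_date(2009, 193)

# weekday() returns 0 for Monday through 6 for Sunday.
print(last_tx_day.isoformat(), "weekday index:", last_tx_day.weekday())
```

Day 193 of 2009 works out to July 12th, which was indeed a Sunday.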
Lew Gramer, who was the acting CHAMP sysadmin in my absence, followed up on July 21st, 2009:
Yes, the last transmission from the Port Everglades station that I see in the archives on our server is still the one on Sunday 12 July at 13:24 GMT. I checked the "usual" diagnostic data from the last day's transmissions (station and datalogger voltages, transmitter forward vs. reflected power, data and error counts, etc.), all seemed normal. Jack is out of town all week as well, but Shoe located that key and will travel to check the site out tomorrow... 
As Lew indicated, Mike Shoemaker (a/k/a "Shoe") traveled up to the Port Everglades site on the morning of July 22nd and was able to bring the station back online.  Lew sent out the following report about Mike's intervention later that afternoon:
Mike Shoemaker visited the Port Everglades monitoring station this morning, reset everything, opened the station door for a while and let it air out / cool off. The station has begun transmitting again: the 14:24 and 15:24 UT transmissions have been received, and were both well-formatted. (Shoe, the last hourly transmission has reasonable looking salinity and sea temperature also.)

Shoe believes part of the reason for the sudden failure in early July may be that the enclosure holds too much heat. We checked, and the hourly mean "panel temperature" climbed above 36°C (97°F) for 50 of the hours when we got regular transmissions in late June and early July. Max PTemp was 38.8°C. (Shoe, went down from 32.7 last hour to 31.9 just now, so at least isn't getting worse.) Jack and Mike J., when you both get back, Shoe has ideas for how to fix this.
Following Jack's next data-recovery visit on July 31st, I reviewed the patched data archives and was able to produce the following email summary of the incident on August 3rd, 2009:
There was NO interruption in datalogger activity in this period.  As far as I can tell from a brief look at the logger/transmitter diagnostics, the problem was specific to the transmitter.  Either the transmitter stopped working, or it merely stopped communicating with the datalogger.  Either way, Shoe's power-reset appears to have fixed the problem, although there's no guarantee that it won't happen again.
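For anyone who wants to spot this kind of outage sooner, here is a rough sketch of a gap check over received-transmission times. It is not the script we actually use: `find_gaps` and its parameters are hypothetical, the hourly cadence is inferred from the 14:24 and 15:24 transmissions, and the 12:24 entry on July 12th is an assumed earlier transmission added only to give the list some context. The 13:24 time is the one from Lew's message.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected=timedelta(hours=1), tolerance=timedelta(minutes=5)):
    """Return (previous, next, gap) for each pair of consecutive
    transmissions spaced further apart than expected + tolerance."""
    ordered = sorted(timestamps)
    return [
        (prev, nxt, nxt - prev)
        for prev, nxt in zip(ordered, ordered[1:])
        if nxt - prev > expected + tolerance
    ]

# Transmission times (UTC). The 12:24 entry is assumed, not taken from the archive.
received = [
    datetime(2009, 7, 12, 12, 24),
    datetime(2009, 7, 12, 13, 24),  # last transmission before the outage
    datetime(2009, 7, 22, 14, 24),  # first transmission after the reset
    datetime(2009, 7, 22, 15, 24),
]

for prev, nxt, gap in find_gaps(received):
    hours = gap.total_seconds() / 3600
    print(f"No data between {prev} and {nxt}: {gap} ({hours:.0f} hours)")
```

With these times the check reports a single gap of 10 days and 1 hour (241 hours) between the last transmission on July 12th and the first one after the reset on July 22nd.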