Thursday, April 23, 2015

port everglades site visit

[Summary: the Port Everglades CREWS station is back online and its feeds have been reenabled.  As of this writing I expect the problem has been resolved, at the cost of no data being captured between 0200 UTC April 20, 2015 and 1900 UTC April 23, 2015.  A more detailed narrative of the problem and fix follows.]

This morning I met Jack Stamates and Natchanon Amornthammarong (Mana) at Port Everglades for a coordinated visit to the Naval base that hosts our Port Everglades station.  They wanted to scout out possible sites for deploying Mana's ammonia sensor, and since Jack was already coordinating with the Navy for permission to access the site, I decided to tag along to work on the CREWS station.

The reason I wanted to visit was this (described in my April 21 post): at 0200 UTC on April 20, 2015, the PVGF1 station had stopped producing data.  The station's cellular connection was working fine and it was still possible to connect to the datalogger directly.  The problem appeared to be with the datalogger's clock: over multiple connections and attempts to reset it, the clock would spontaneously revert to garbage timestamps, once in 1930 and once in 2066.  The logger's measurement operations appeared to be working normally, but when it tried to write those measurements to its data tables the invalid dates seem to have caused errors, and no data could be saved.

On Tuesday I tried resetting the clock to the correct UTC time, but it reverted to garbage timestamps, either immediately or within 24 hours.  I tried uploading variations on the logger program, theorizing that perhaps there was a problem with the memory module or memory card, but they did not help.  Rather than continue remote troubleshooting I decided to swap out the datalogger in person, figuring that this would be a brute-force solution.  I also disabled feeds of data from this station in case those 1930/2066 timestamps somehow got reported.

Since I'd be visiting the site in person, I decided to replace the GOES transmitter and WXT (the 'weather transmitter' by Vaisala) while I was there.

This site uses one of our ancient SAT-HDR-GOES transmitters, although with a direct cellular modem connection available we do not depend on GOES for data transmission.  Still, it's nice to have redundant streams of data from the same site and the GOES transmitter at PVGF1 hasn't worked properly for at least five months.  I wasn't sure if any of our remaining SAT-HDR-GOES transmitters still worked but I chose the most likely candidate for swapping, figuring at worst we'd still have only the cellular data feed, which is extremely reliable.

The WXTs at our regular ocean-based sites are swapped out annually and often they fail, at least the acoustic wind sensors do, before their year is out.  I think that's mostly because of the very large boobies that perch on our ocean-based stations, most particularly the one at La Parguera, PR.  But the last WXT at Port Everglades was deployed for almost 2.5 years with no apparent data degradation.  Its successor WXT has only been deployed for 13 months, but since we'd be on-site and it was nominally past our usual yearlong deployment time I decided to swap the WXT as well, as a low-priority task if everything else went well.

And everything did go well.  Despite a little bit of rain in the area I was able to swap the logger, transmitter and WXT.  Jack and Mana stuck around until I was done and kindly spotted me on the ladder when I swapped the WXT.  I connected to the logger by laptop before leaving to make sure all of the sensors were connected properly, and everything seemed okay, including the logger clock.

While I was connected on-site I had my first real hint of what may have caused the problem.  In looking over the variables in logger memory I was reminded that the GOES-transmitting stations actually reset their logger clocks using GPS-sourced time, once per day, as a way to avoid clock drift.  So for the first time it occurred to me that the problem might not be a logger clock failure, but rather a mangled GPS time from the transmitter.  Still, I was replacing both logger and transmitter so it didn't seem important which one had caused this problem.

I packed everything up and, after stopping briefly at home to check whether the cellular modem was still connecting (it was), returned to the lab.  Once here I connected to the station again and was surprised and a little dismayed to find that the new logger's clock was now set to a date in 1930.  So clearly this wasn't a simple case of equipment failure as I'd assumed.

What I think now is that there might be some kind of date overflow in the SAT-HDR-GOES transmitter's internal date representation, one that might be affecting both of our transmitters (and maybe all of them).  The SAT-HDR-GOES has long since fallen out of support, and actually so has its replacement transmitter, the TX312.  We used this transmitter model at Port Everglades originally as a cost-saving measure (rather than pay for new transmitters).  It's possible that this transmitter model, which was never expected to still be operational in 2015, has lost the ability to report GPS timestamps accurately.  I theorize that resetting the logger clock with all zeroes led to the dates in 1930, and perhaps setting the clock with all-fill times could produce the dates in 2066 that I saw.
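
As a rough arithmetic check on the all-zeroes / all-fill theory, the two garbage years can be compared with the range of a fixed-width counter.  The sketch below assumes, purely hypothetically, a 32-bit count of seconds; the actual date representation used by the transmitter and logger is not known here, so this shows only that the spread is of a plausible size, not what the cause is.

```python
# Rough arithmetic only. The 32-bit seconds counter is an ASSUMPTION made for
# illustration; the real internal date format is not documented in this post.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60  # mean Gregorian year

# Range covered by a counter running from all zeroes to all ones (all-fill).
counter_range_years = 2**32 / SECONDS_PER_YEAR

# Spread between the two garbage years seen on the logger clock.
observed_spread_years = 2066 - 1930

print(f"32-bit seconds counter spans about {counter_range_years:.1f} years")
print(f"observed garbage years are {observed_spread_years} years apart")
```

The two figures come out close to each other, which is at least consistent with one extreme value mapping to 1930 and the other to 2066.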

The situation now is that I've disabled the clock-reset code in the datalogger program, and instead I've manually set the time at this station based on our LoggerNet server time.  I have reenabled the feeds to NDBC, our CHAMP database, the CHAMP Portal, and the G2 Ecoforecasting system.  We'll be monitoring this station's performance extra-closely in the coming days but (1) the problem seems to have been fixed and (2) as a bonus we have deployed a fresh WXT that need not be swapped again for at least another year.
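
If the daily GPS clock reset is ever reenabled, one option would be to guard it with a plausibility check, so that a mangled GPS time is ignored rather than written to the logger clock.  The following is a minimal sketch in Python, not the logger's own program; the function name, its arguments and the five-minute threshold are all hypothetical.

```python
from datetime import datetime, timedelta

# Assumed threshold: a daily drift correction should be small, so any larger
# jump is treated as a bad GPS time. Five minutes is an illustrative value.
MAX_CORRECTION = timedelta(minutes=5)


def choose_clock_time(logger_time, gps_time, max_correction=MAX_CORRECTION):
    """Return the time the logger clock should be set to.

    The GPS time is accepted only if it differs from the current logger
    time by no more than max_correction; otherwise the logger time is kept.
    """
    if gps_time is None:
        return logger_time  # no GPS fix available, leave the clock alone
    if abs(gps_time - logger_time) > max_correction:
        return logger_time  # implausible jump, e.g. a date in 1930 or 2066
    return gps_time


logger_now = datetime(2015, 4, 23, 19, 0, 0)

# A two-second drift correction is accepted.
print(choose_clock_time(logger_now, datetime(2015, 4, 23, 19, 0, 2)))

# A garbage timestamp is rejected and the logger time is kept.
print(choose_clock_time(logger_now, datetime(1930, 1, 1, 0, 0, 0)))
```

A guard like this only helps while the logger clock itself is roughly correct, so it would complement an occasional manual check against server time rather than replace it.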





(signed)
Mike Jankulak

Tuesday, April 21, 2015

port ev logger problems

[This is the text of an email that I wrote Tuesday morning, April 21, 2015.  This blog post will be backdated to the date and time of that email.]

As it happens I received notice from NDBC this morning that they have not received any new Port Everglades data from us in over 24 hours, and there are backlogs of cronjob errors and a download alert in my mailbox today as well.

As near as I can tell, it looks like a problem with the datalogger's internal clock.  I am seeing timestamps in the year 2066 and, a few times when I've connected, the internal clock has been reset to 1930.  I have reset the logger clock a few times to the correct time (UTC).  The first few times it got corrupted again almost right away.  At present it has held on to the correct time for about ten minutes but I don't trust it.

[These bad timestamps are causing the logger to skip its datatables.  Essentially I think its measurement actions are working just fine but it balks when asked to record those measurements with out-of-bounds timestamps.]

Anyhow I have disabled all of our PVGF1 feeds for the present.  In the short term it may be possible to reenable them by day's end if the clock seems to be holding steady.  In the long term we will probably want to replace that logger.

I know Jack and Mana might be visiting that site soon so depending on their timing I might suggest that I tag along, if that's alright with everyone.

(signed)
Mike Jankulak