Cosmic Cheering Resumes
For 15 nail-biting minutes, everyone from NASA Administrator Sean O'Keefe to the most junior member of the Mars Exploration Rover project waited for the signal indicating the lander was still alive. The cheering resumed when the signal came.
Spirit's outage, weeks after that brief silence, left engineers guessing. JPL engineers could keep sending new commands to the rover, but they had no way of knowing whether it was listening or whether new instructions might do more harm than good.
Often, a communications breakdown spells the end of a mission, as appears to be the case for the European Space Agency's Beagle 2 Mars lander, which failed to radio back after touchdown in December. The rovers, on the other hand, are designed to be as autonomous and resilient as possible, meaning that they will try to debug their own problems and radio diagnostic information back to Earth even if they are not receiving commands.
For Joseph Wackley, the Deep Space Network's mission system operations manager, the silence meant he had to fortify the network to make sure Spirit's signal would not be missed when, or if, it came. NASA brought on additional staff and powered up onsite generators to guarantee that antennae would be up and running.
"That's exactly the nightmare," says Wackley, who was at a Pasadena facility of subcontractor ITT Industries the night Spirit landed. "We have to make sure it's not us contributing to why they are not seeing the signals."
When trying to regain a connection, the Deep Space Network puts its reception equipment in a "closed-loop" mode, continually scanning a range of frequencies around the expected one, looking for some kind of signal that it can "lock on" to. The closed-loop mode kicked in for the 15 silent minutes during Spirit's landing and again in the latest communications crisis.
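The scanning idea can be illustrated with a toy sketch. This is not the Deep Space Network's receiver logic, only a minimal model of searching a band around an expected frequency; the function name, the `measure_power` callback and the threshold are all assumptions made for illustration.

```python
def sweep_for_signal(expected_hz, half_width_hz, step_hz, measure_power, threshold):
    """Search frequencies around expected_hz for a signal to lock on to.

    measure_power is a hypothetical callback returning the received power
    at a given frequency. Returns the first frequency whose power meets
    the threshold, or None if the whole band is silent.
    """
    steps = int(half_width_hz // step_hz)
    # Try the expected frequency first, then alternate outward on both sides.
    offsets = [0]
    for k in range(1, steps + 1):
        offsets.extend([k * step_hz, -k * step_hz])
    for offset in offsets:
        freq = expected_hz + offset
        if measure_power(freq) >= threshold:
            return freq  # candidate frequency to lock on to
    return None
```

A real receiver would repeat such a sweep continually until a signal appears; here a single pass is enough to show the search order.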
JPL finally caught a break Jan. 23. After several rounds of sending instructions that were not acknowledged, JPL received a transmission Spirit sent on its own initiative. But engineers still had trouble getting Spirit to respond to commands or send back intelligible data. One communications session relayed via the Surveyor orbiter picked up static, as if the UHF antenna had been left on but wasn't controlled by Spirit's computer.
Gradually, JPL was able to rebuild the communications link through trial and error.
Although project manager Pete Theisinger originally told a press conference that some electrical or mechanical failure was suspected, the investigation subsequently pointed to a software-only problem.
It turned out the rover had become trapped in a cycle of continual reboots, crashing each time it tried to access the two flash-memory devices it uses for storage of images and other data. The cycle of some 60 reboots over 30 hours also prevented Spirit from going to sleep overnight when no solar power was available, causing it to run down its batteries.
To get the robot's software to work normally, JPL had to disable the flash-memory devices so the onboard computer would boot using only Random Access Memory (RAM), which stores information for active use by a computer only when power is present.
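The reboot trap and the RAM-only workaround can be pictured with a small simulation. It is a sketch under stated assumptions, not Spirit's flight software: `FlashMountError`, `boot_until_stable` and the `mount_flash` callback are hypothetical names, and the default of 60 attempts simply echoes the reboot count reported above.

```python
class FlashMountError(Exception):
    """Hypothetical error standing in for a crash while accessing flash."""


def boot_until_stable(mount_flash, use_flash=True, max_attempts=60):
    """Simulate repeated boots; return (mode, attempts_used).

    mount_flash is a hypothetical callback that raises FlashMountError
    whenever accessing flash memory would crash the computer.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if use_flash:
                mount_flash()  # crashes while flash is in a bad state
                return ("normal", attempt)
            # Flash disabled: boot from RAM only, skipping the failing step.
            return ("ram-only", attempt)
        except FlashMountError:
            continue  # the crash triggers another reboot
    return ("stuck", max_attempts)
```

With a `mount_flash` that always fails, the loop never stabilizes until `use_flash` is switched off, at which point the first boot succeeds in RAM-only mode.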
But this left Spirit operating in a crippled state, since data held in RAM evaporates when a computer is powered down. Like a digital camera, the rover uses flash memory for temporary storage of data to be sent to Earth later. But apparently the scientists had hoarded data too aggressively, filling flash storage with data collected during the cruise between Earth and Mars as well as data from the surface exploration. Eventually, just keeping track of all those files consumed so much memory that Spirit's software was unable to function normally.
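The bookkeeping problem comes down to simple arithmetic: if tracking each stored file costs a fixed amount of working memory, enough files will exhaust any budget. The sketch below assumes a fixed per-file cost; the function names and any figures passed to them are purely illustrative and are not Spirit's actual numbers.

```python
def bookkeeping_bytes(file_count, bytes_per_entry):
    """Working memory needed just to track file_count files."""
    return file_count * bytes_per_entry


def max_trackable_files(ram_budget_bytes, bytes_per_entry):
    """Largest number of files whose tracking data fits in the budget."""
    return ram_budget_bytes // bytes_per_entry
```

Once the number of stored files passes `max_trackable_files`, the tracking data alone no longer fits, regardless of how much flash space remains.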