Cosmic Cheering Resumes

By David F. Carr  |  Posted 2004-02-06

For 15 nail-biting minutes, everyone from NASA Administrator Sean O'Keefe to the most junior member of the Mars Exploration Rover project waited for the signal indicating the lander was still alive. The cheering resumed when the signal came.

Spirit's outage, weeks after that brief silence, left engineers guessing. JPL engineers could keep sending new commands to the rover, but they had no way of knowing whether it was listening or whether new instructions might do more harm than good.

Often, a communications breakdown spells the end of a mission, as appears to be the case for the European Space Agency's Beagle 2 Mars lander, which failed to radio back after touchdown in December. On the other hand, the rovers are designed to be as autonomous and resilient as possible, meaning that they will try to debug their own problems and radio diagnostic information back to Earth even if they are not receiving commands.

For Joseph Wackley, the Deep Space Network's mission system operations manager, the silence meant he had to fortify the network to make sure Spirit's signal would not be missed when, or if, it came. NASA brought on additional staff and powered up onsite generators to guarantee that antennae would be up and running.

"That's exactly the nightmare," says Wackley, who was at a Pasadena facility of subcontractor ITT Industries the night Spirit landed. "We have to make sure it's not us contributing to why they are not seeing the signals."

When trying to regain a connection, the Deep Space Network puts its reception equipment in a "closed-loop" mode, continually scanning a range of frequencies around the expected one, looking for some kind of signal that it could "lock on" to. The closed-loop mode kicked in for the 15 silent minutes during Spirit's landing and again in the latest communications crisis.
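The scanning idea can be illustrated with a minimal sketch. It is not the Deep Space Network's actual receiver logic; the function name, the `measure_power` callback, the threshold, and the outward search order are all assumptions made for illustration. It simply tries frequencies on either side of the expected one and reports the first that looks like a signal.

```python
from typing import Callable, Optional


def sweep_for_carrier(
    expected_hz: float,
    half_span_hz: float,
    step_hz: float,
    measure_power: Callable[[float], float],
    lock_threshold: float,
) -> Optional[float]:
    """Return the first frequency whose measured power reaches the threshold.

    Hypothetical helper: frequencies are tried outward from the expected
    value (expected, +step, -step, +2*step, -2*step, ...).
    """
    if step_hz <= 0:
        raise ValueError("step_hz must be positive")

    # Build the list of offsets to try, nearest to the expected value first.
    steps = int(half_span_hz // step_hz)
    offsets = [0.0]
    for k in range(1, steps + 1):
        offsets.extend((k * step_hz, -k * step_hz))

    for offset in offsets:
        freq = expected_hz + offset
        if measure_power(freq) >= lock_threshold:
            return freq  # candidate to "lock on" to

    return None  # nothing found in this pass; the caller may sweep again
```

A caller would repeat the sweep for as long as the spacecraft stays silent, which is why keeping the antennae powered and staffed mattered during the outage.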

JPL finally caught a break Jan. 23. After several rounds of sending instructions that were not acknowledged, JPL received a transmission Spirit sent on its own initiative. But engineers still had trouble getting Spirit to respond to commands or send back intelligible data. One communications session relayed via the Mars Global Surveyor orbiter picked up static, as if the UHF antenna had been left on but wasn't controlled by Spirit's computer.

Gradually, JPL was able to rebuild the communications link through trial and error.

Although project manager Pete Theisinger originally told a press conference that an electrical or mechanical failure was suspected, the investigation subsequently pointed to a software-only problem.

It turned out the rover had become trapped in a cycle of continual reboots, crashing each time it tried to access the two flash-memory devices it uses for storage of images and other data. The cycle of some 60 reboots over 30 hours also prevented Spirit from going to sleep overnight when no solar power was available, causing it to run down its batteries.

To get the robot's software to work normally, JPL had to disable the flash-memory devices so the onboard computer would boot using only random-access memory (RAM), which stores information for active use by a computer only when power is present.
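The reboot cycle and the RAM-only workaround can be pictured with a small simulation. This is a sketch of the general pattern only, not Spirit's flight software; `FlashMountError`, `run_boot_cycles`, and the `use_flash` flag are hypothetical names, and the flag stands in for the instruction sent from Earth to skip flash.

```python
from typing import Callable, Optional, Tuple


class FlashMountError(Exception):
    """Raised when the (simulated) flash storage cannot be accessed."""


def run_boot_cycles(
    mount_flash: Callable[[], None],
    use_flash: bool,
    max_cycles: int,
) -> Tuple[Optional[str], int]:
    """Simulate repeated boot attempts.

    Returns (mode, reboots), where mode is "flash", "ram-only", or None if
    every attempt within max_cycles crashed.
    """
    reboots = 0
    for _ in range(max_cycles):
        if not use_flash:
            # Flash is skipped entirely, so the failing step never runs.
            return "ram-only", reboots
        try:
            mount_flash()
            return "flash", reboots
        except FlashMountError:
            reboots += 1  # crash, then start the next boot attempt
    return None, reboots


def always_fails() -> None:
    """Stand-in for a flash access that crashes every time."""
    raise FlashMountError("simulated flash failure")
```

With `use_flash=True` and a mount step that always fails, the simulation never gets past booting; with `use_flash=False` it comes up immediately, at the cost of having no persistent storage.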

But this left Spirit operating in a crippled state, since data held in RAM evaporates when a computer is powered down. Like a digital camera, the rover uses flash memory for temporary storage of data to be sent to Earth later. But apparently the scientists had hoarded data too aggressively, filling flash storage with data collected during the cruise between Earth and Mars as well as data from the surface exploration. Eventually, just keeping track of all those files consumed so much memory that Spirit's software was unable to function normally.

David F. Carr is the Technology Editor for Baseline Magazine, a Ziff Davis publication focused on information technology and its management, with an emphasis on measurable, bottom-line results. He wrote two of Baseline's cover stories focused on the role of technology in disaster recovery, one focused on the response to the tsunami in Indonesia and another on the City of New Orleans after Hurricane Katrina. David has been the author or co-author of many Baseline Case Dissections on corporate technology successes and failures (such as the role of Kmart's inept supply chain implementation in its decline versus Wal-Mart or the successful use of technology to create new market opportunities for office furniture maker Herman Miller). He has also written about the FAA's halting attempts to modernize air traffic control, and in 2003 he traveled to Sierra Leone and Liberia to report on the role of technology in United Nations peacekeeping. David joined Baseline prior to the launch of the magazine in 2001 and helped define popular elements of the magazine such as Gotcha!, which offers cautionary tales about technology pitfalls and how to avoid them.
