Jeff Waugh (@jdub) tweeted an interesting article about how the previous Mars rovers turned out to have a fairly serious software problem, and how the JPL engineers diagnosed and fixed it.
Some of the points I found particularly interesting:
- JPL use a proprietary, off-the-shelf operating system (VxWorks) to run the rovers
- Having a replica of a live system for debugging is very useful
- Leaving debugging tools in the remote system saved the day
- Finding a way to reproduce the error is critical
- Don’t ignore strange behaviour in the hope that it will just go away
I love reading stories of issues like this and how the engineers fixed them. Well worth reading.