Bugs and Crashes

Now that the IrCOMM code will be used more widely, here is an explanation of the current bugs and of why crashes occur.

The most common symptoms of something going wrong are:

  1. a spontaneous restart of the Newton
  2. an error message saying “Sorry a problem has occurred” with a positive, random error number
  3. stalled communication between the Newton and the peer device
  4. crash of the peer device
  5. no connection to the peer device
  6. failure of an upper level protocol (PPP, TCP/IP, HTTP)

Symptoms 1 and 2 hint at a mistreatment of some of the internal IrDA data. One example is the treatment of an IrLMP control packet as a data packet. I’ve tried to make the code robust against these cases, but how well that works depends mostly on the packet types sent by the peer device (and thus on the peer’s IrDA stack implementation) and on how far my reverse engineering of the IrDA and comm layers has succeeded.
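To illustrate the control/data mix-up, here is a minimal sketch in Python (not the Nitro code itself). It assumes the IrLMP header layout as I understand it from the specification: two header bytes, with the top bit of the first byte marking a control frame and the opcode following in the third byte. The function name classify_irlmp is hypothetical.

```python
CONTROL_BIT = 0x80  # assumed: top bit of the first IrLMP header byte
OPCODE_MASK = 0x7F  # assumed: low seven bits of the opcode byte


def classify_irlmp(frame: bytes):
    """Return ("control", opcode) or ("data", payload) for one IrLMP frame."""
    if len(frame) < 2:
        raise ValueError("IrLMP frame is shorter than its 2-byte header")
    if frame[0] & CONTROL_BIT:
        if len(frame) < 3:
            raise ValueError("control frame has no opcode byte")
        # Control frames must never be handed to the data path.
        return ("control", frame[2] & OPCODE_MASK)
    # Everything after the two LSAP selector bytes is user data.
    return ("data", frame[2:])
```

Checking the control bit before anything else, and rejecting frames that are too short, is the kind of guard that keeps a control packet from being interpreted as payload.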

Symptom 3 happens when there is either a protocol error (i.e. one side sends packets the other side doesn’t handle in its current state) or, more likely, there are not enough resources on the Newton to deal with the packet. A typical situation is receiving lots of small packets and handling each packet in a time-consuming way. The layers of the comm system are not completely decoupled, meaning that an upper layer can effectively block the whole stack. If that happens, the Newton sends RNR (receive not ready) packets to the peer, telling it that the Newton is currently unable to handle data packets. The cure for a situation like this is to increase the number and size of the IrDA receive buffers (currently, I use seven 512-byte buffers).
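The buffer exhaustion behind symptom 3 can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the class ReceivePool and its methods are hypothetical, and real IrLAP acknowledges frames per window rather than per packet. The defaults mirror the seven 512-byte buffers mentioned above.

```python
class ReceivePool:
    """Toy model of a fixed pool of receive buffers (hypothetical names)."""

    def __init__(self, count=7, size=512):
        self.size = size    # capacity of one buffer in bytes
        self.free = count   # buffers not currently holding a packet
        self.queued = []    # packets waiting for the upper layer

    def on_packet(self, payload: bytes) -> str:
        """Store a packet; answer "RR" (ready) or "RNR" (not ready)."""
        if len(payload) > self.size:
            raise ValueError("packet does not fit into one buffer")
        if self.free == 0:
            return "RNR"    # upper layer is too slow, the peer must wait
        self.free -= 1
        self.queued.append(payload)
        return "RR"

    def consume(self):
        """Upper layer takes the oldest packet and releases its buffer."""
        if not self.queued:
            return None
        self.free += 1
        return self.queued.pop(0)
```

With the default pool, eight small packets arriving before the upper layer calls consume() once make the eighth on_packet() call return "RNR". A larger count delays that point, which is why more buffers help.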

Symptom 4 is caused by sending a malformed IrCOMM or TinyTP packet. Although I tried to catch all packets leaving the Newton and transform them into correct IrCOMM packets, there might still be a place where a packet slips through an unknown channel (this happened with the PPP driver talking directly to the serial comm tool).
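What "transforming into a correct IrCOMM packet" means can be sketched as follows. This is an assumption-laden illustration in Python, not the Nitro code: it assumes a TinyTP data packet starts with one credit byte (top bit clear, low seven bits carrying the credit) and that the IrCOMM frame inside starts with a length byte for the control parameters. The function name wrap_ircomm is hypothetical.

```python
def wrap_ircomm(data: bytes, control: bytes = b"", credit: int = 0) -> bytes:
    """Wrap raw serial data in assumed TinyTP and IrCOMM framing."""
    if not 0 <= credit <= 0x7F:
        raise ValueError("credit must fit into seven bits")
    if len(control) > 0xFF:
        raise ValueError("control parameters must fit into one length byte")
    # credit byte, control length byte, control parameters, user data
    return bytes([credit, len(control)]) + control + data
```

Under these assumptions, raw bytes from a driver that bypasses this wrapping would reach the peer without the two leading bytes, so the peer would misread the first data bytes as credit and control length.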

Symptom 5 is usually a sign that the peer device and the Newton disagree on which IrDA service to use. Currently, the Newton requests the IrDA:IrCOMM service over TinyTP. Peer devices can, however, provide IrLPT or something else and, as a consequence, deny the connection.

Symptom 6 is unrelated to IrCOMM, but it could still be caused by errors in the Nitro code, e.g. when packet data is corrupted.

Anyway, I’m working on all of these…

2003-02-26