CAN Errors / CAN Error States

What are Error Active, Error Passive, and Bus off of CAN Bus?

Just to give a little background to the answer:

In order to prevent malfunctioning nodes from disturbing, or even blocking, an entire system, the CAN protocol implements a sophisticated fault confinement mechanism. The CAN protocol is intended to be orthogonal, i.e. all nodes address faults in the same manner. Fault confinement is provided where each node constantly monitors its performance with regard to successful and unsuccessful message transactions. A Transmit Error Counter (TEC) and a Receive Error Counter (REC) create a metric for communication quality based on historic performance. Each node will act on its own bus status based on its individual history. As a result, a graceful degradation allows a node to disconnect itself from the bus i.e. stop transmitting. This means that a permanently faulty device will cease to be active on the bus (go into Bus Off state), but communications between other nodes can continue unhindered. If the bus media is severed, shorted or suffers from some other failure mode the ability to continue communications is dependent upon the condition and the physical interface used.

Fault confinement is a checking mechanism that makes it possible to distinguish between short disturbances (e.g. switching noise from a nearby power cable couples into the transmission media) and permanent failures (e.g. a node is malfunctioning and disturbs the bus).

Manipulation of the error counters is asymmetric. On a successful transmission, or reception, of a message, the respective error counter is decremented if it had not been at zero. In the case of a transmit or receive error the counters are incremented, but by a value greater than the value they would be decrement by following a successful message transaction.

If a node detects a local error condition (e.g. due to local conducted noise, application software, etc.), its resulting error flag (primary error flag) will subsequently cause all other nodes to respond with an error flag too (secondary error flags). It is important that a distinction is made between the nodes that detected an error first and the nodes which responded to the primary error flag. If a node transmits an active error frame, and it monitors a dominant bit after the sixth bit of its error flag, it considers itself as the node that has detected the error first. In the case where a node detects errors first too often, it is regarded as malfunctioning, and its impact to the network has to be limited. Therefore, a node can be in one of three possible error states:

Error active Both of its error counters are less than 128. It takes part fully in bus communication and signals an error by transmission of an active error frame.This consists of sequence of 6 dominant bits followed by 8 recessive bits, all other nodes respond with the appropriate error flag, in response to the violation of the bit stuffing rule.

Error passive A node goes into error passive state if at least one of its error counters is greater than 127. It still takes part in bus activities, but it sends a passive error frame only, on errors. Furthermore, an error passive node has to wait an additional time (Suspend Transmission Field, 8 recessive bits after Intermission Field) after transmission of a message, before it can initiate a new data transfer. The primary passive error flag consists of 6 passive bits and thus is “transparent” on the bus and will not “jam” communications.

Bus Off If the Transmit Error Counter of a CAN controller exceeds 255, it goes into the bus off state. It is disconnected from the bus (using internal logic) and does not take part in bus activities anymore. In order to reconnect the protocol controller, a so-called Bus Off recovery sequence has to be executed. This usually involves the re-initialization and configuration of the CAN controller by the host system, after which it will wait for 128 * 11 recessive bit times before it commences communication.

CAN Error Confinement Rules