I’ve encountered a couple of issues with peripheral initialisation on the C6678. I have workarounds for both which seem reliable but I would really like to understand what the issues are. This will allow me to be confident both that the workarounds are robust and that we are unlikely to see any further such issues.
Issue 1
I currently use 8 of the map entries within the sRIO RXU to direct messages onto a different queue for each core. Each core initialises its own map entry and, since the Rx channels have to be disabled and re-enabled around this programming, I use a semaphore to mutually exclude the cores, so that only one core is writing to the registers at any given time. After a fair amount of debugging, I found that the queue programming was not reliable: roughly 1% of the time, writing RXU_MAP_L, RXU_MAP_H and RXU_MAP_QID (in that order) for entry n would corrupt the RXU_MAP_L register for entry (n + 1). The corrupted value always had the top 24 bits as 0, with the bottom 8 bits holding a value not obviously related to the values written. My workaround, which appears reliable, is to read the RXU_MAP_L register for entry (n + 1) before programming entry n, then, once entry n has been written, check whether entry (n + 1) has been corrupted and reinstate it if so.
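To make the workaround concrete, here is a minimal sketch in C. The struct layout, RXU_NUM_MAPS and rxu_program_map are illustrative names of my own, not the CSL definitions, and the entry count is an assumption. The caller is assumed to hold the inter-core semaphore and to have disabled the Rx channels already.

```c
#include <assert.h>
#include <stdint.h>

#define RXU_NUM_MAPS 64u /* assumption: number of RXU map entries */

/* Hypothetical view of one RXU map entry; not the CSL register layout. */
typedef struct {
    volatile uint32_t map_l;
    volatile uint32_t map_h;
    volatile uint32_t map_qid;
} rxu_map_regs_t;

/* Programs entry n, then reinstates RXU_MAP_L of entry (n + 1) if it changed.
 * Returns 1 if entry (n + 1) was found corrupted and repaired, otherwise 0. */
static int rxu_program_map(rxu_map_regs_t *maps, uint32_t n,
                           uint32_t map_l, uint32_t map_h, uint32_t map_qid)
{
    assert(n < RXU_NUM_MAPS);

    /* Snapshot the neighbour before touching entry n (if there is one). */
    int has_next = (n + 1u) < RXU_NUM_MAPS;
    uint32_t next_l = has_next ? maps[n + 1u].map_l : 0u;

    /* Same write order as described above: L, then H, then QID. */
    maps[n].map_l = map_l;
    maps[n].map_h = map_h;
    maps[n].map_qid = map_qid;

    /* Check the neighbour and put the saved value back if it was disturbed. */
    if (has_next && maps[n + 1u].map_l != next_l) {
        maps[n + 1u].map_l = next_l;
        return 1;
    }
    return 0;
}
```

The return value is only there so that the corruption rate can be counted during debugging; it is not needed for the workaround itself.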
Issue 2
During initialisation, I was getting a configuration bus error, with error “Write Error” and status “Success”. What does this mean? Is there a register which gives the faulting address, as there is for the other memory protection units? I was able to narrow the cause down to the initialisation of one of the EDMA3 instances, which is used by all 8 cores to transfer data over the PCIe interface. During initialisation, after core 0 had done the fundamental initialisation of the peripheral, all the cores were writing to the same small set of registers. (This bit of code should really be re-factored so that only one core writes these registers in the peripheral and the remaining cores initialise only their own state, but what’s there works until I have the time to do this.) On a whim, I tried mutually excluding this portion of the initialisation, and I have not seen the problem since. Unless this has simply shifted the timing so that the problem has become benign, it implies that there is a requirement to mutually exclude certain types of access to some registers on the configuration bus. I could not find this specified in the documentation on the device; please could you provide further guidance on this?
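For clarity, the mutual exclusion I added has roughly the shape sketched below. All the names here (cfg_lock, reg_write_t, shared_regs_write) are hypothetical, and the C11 atomic_flag spinlock is only a host-side stand-in so that the sketch is self-contained: on the device the lock would be the hardware semaphore, not an atomic in memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the inter-core lock; on the device this would be a
 * hardware semaphore rather than an atomic flag (assumption). */
static atomic_flag cfg_lock = ATOMIC_FLAG_INIT;

static void cfg_lock_acquire(void)
{
    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&cfg_lock, memory_order_acquire)) {
    }
}

static void cfg_lock_release(void)
{
    atomic_flag_clear_explicit(&cfg_lock, memory_order_release);
}

/* One register write: target address and the value to store there. */
typedef struct {
    volatile uint32_t *reg;
    uint32_t value;
} reg_write_t;

/* Performs a batch of shared-register writes with the lock held, so that
 * only one core at a time accesses this set of registers. */
static void shared_regs_write(const reg_write_t *writes, uint32_t count)
{
    assert(writes != NULL || count == 0u);

    cfg_lock_acquire();
    for (uint32_t i = 0u; i < count; i++) {
        *writes[i].reg = writes[i].value;
    }
    cfg_lock_release();
}
```

Each core would call shared_regs_write with its list of EDMA3 register writes. The planned re-factoring would remove the need for this by leaving the shared registers to a single core.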
Note that I have been developing using TMX silicon, if that makes a difference, though Issue 1 has definitely been seen on TMS silicon as well.
Regards,
SPH