The most efficient IPC mechanism on C6678

Hi,

I'm trying to find the most efficient way to exchange "data" between several cores on the C6678 DSP. I've attached a figure of the basic application topology.

Background information:

We have two independent real-time digital signal processing (RT-DSP) paths for channel 1 and channel 2. The total workload of each data-plane path is partitioned among two cores (core-pipelining). Further, co-processing-cores provide different signal parameters to the RT-DSP-Tasks running on the data-plane.

Requirements:

What I mean "with the most efficient way to exchange data between cores" is in terms of minimal latency. Further, with "minimal latency" is not meant a few microseconds but rather a few core clock cycles. The size of the exchanged data packages between the cores is configurable and ranges from one single data samples (single precision floating-point) --> 4Byte up to several tenth of data samples --> max. 256Byte. The data flow between all cores is absolutely regular and invariant --> continuous data streaming application.

The hard timing constraints (for completing the entire data-plane path (RT-DSP#1 and RT-DSP#2) for a single data sample) are at least 200ns and must not be exceeded! These results in only 250 instruction cycles @1.25GHz core clock. The total complexity of the signal processing, which is completed on the data plane, is lower than 400 instruction. Partitioning the total workload among two cores results in a total workload for RT-DSP#1 and RT-DSP#1 lower than 200 instruction. That's why each single instruction cycle counts...

Solution Possibilities:

What I surely can say is that the application will not use any OS (e.g. SYS/BIOS, etc.). I would guess that an OS-based IPC mechanisms would surely take several hundreds of core clock cycles.

So I could get an rough overview about the C6678 architecture and have now a little idea what different mechanism for managing IPC tasks are available.

Only to name some:

EDMA3 (-->for signaling events after data exchange completion via HW-IRQ)
Multicore Navigator incl. PKTDMA and QM (-->for signaling events and data exchange)
Semaphore2 hardware (-->only for signaling events. data exchange is done via shared memory)
Notification/Signaling via SW-Interrupts (-->only for signaling events. data exchange is done via shared memory)
Shared memory (MSMC-RAM) (-->for signaling events and data exchange)
Direct access to local L2 memory of destination core-pack (-->for signaling events and data exchange)
etc.

Possibly you could complete the above list with additional techniques and give me an advise what to look at first in more detail and why (ideally it should be the most promising mechanism regarding the requirements mentioned above). I try to avoid writing a benchmark application for every provided IPC mechanism.

Kind regards,

Viktor

The most efficient IPC mechanism on C6678

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112