Hello Experts,
I recently read the reference guides for EVE and VCOP, and I didn't understand exactly how many variables/values EVE can calculate and process per cycle. One of the documents says that VCOP can handle 16 registers x 8 lanes (x 40-bit values).
1. Does that mean that 16 x 8 = 128 (40-bit?) values can be calculated every cycle?
2. Does VCOP process the 16 vector registers in parallel within one cycle, or one after another?
3. What exactly are the calculations that occur every cycle? Are they an entire loop in the VCOP Kernel-C file (with several instructions and calculations), or a single basic expression (like VADD, VCMOVE, etc.)?
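To make the counting behind question 1 explicit, here is a small sketch in plain host-side C (not VCOP Kernel-C). The constant and function names are my own, and the numbers are just the ones quoted from the document. It only computes how many values the vector registers would hold in total. Whether all of them are actually processed in a single cycle is exactly what I am asking.

```c
/* Figures as quoted from the document; all names here are hypothetical. */
enum {
    VCOP_NUM_VREGS      = 16, /* vector registers */
    VCOP_LANES_PER_VREG = 8,  /* lanes per vector register */
    VCOP_BITS_PER_LANE  = 40  /* bits per lane value */
};

/* Total number of lane values held across all vector registers. */
static int vcop_total_lane_values(void)
{
    return VCOP_NUM_VREGS * VCOP_LANES_PER_VREG;
}

/* Total number of bits held across all vector registers. */
static int vcop_total_register_bits(void)
{
    return vcop_total_lane_values() * VCOP_BITS_PER_LANE;
}
```

With these numbers, vcop_total_lane_values() gives 128 and vcop_total_register_bits() gives 5120. That is a storage count only, under my assumption that all 16 registers have 8 lanes of 40 bits each.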
Thanks in advance,
Tobias