don't assign K2 as c*noc x N, but only max(c,noc) x N, and store each one after the other
swap to save space ??
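
rough size comparison for the two K2 layouts, as a sketch only: assumes c, noc and N are positive integer dimensions and that the reduced buffer is max(c, noc) x N; the helper name k2_elements is made up for illustration and is not from the code this note refers to

```python
def k2_elements(c, noc, n):
    """Return (full, reduced) element counts for K2.

    full    -- K2 held as one (c*noc) x n block
    reduced -- K2 held as max(c, noc) x n (assumed reading of the note)
    """
    full = c * noc * n
    reduced = max(c, noc) * n
    return full, reduced


# Hypothetical dimensions, chosen only to show the ratio.
full, reduced = k2_elements(c=4, noc=6, n=1000)
print(full, reduced, full // reduced)
```

the saving factor is c*noc / max(c,noc) = min(c,noc), so it only pays off when both c and noc are larger than 1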