Input:
[batchSize, imageY, imageX, channels]
[imageY, imageX]
Kernel:
[kernelY, kernelX, channels, filters]
[kernelY, kernelX]
Output:
[batchSize, imageY2, imageX2, channels * filters]
[imageY2, imageX2]
Without FPGA-side memory
The FPGA compute units are used in a distributed fashion (batchSize * channels * filters) times.
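# One FPGA call per (sample, channel, filter) triple.
# input[sample][channel] is one [imageY, imageX] plane,
# kernel[channel][filter] is one [kernelY, kernelX] kernel.
for sample in range(batchSize):
    for channel in range(channels):
        for filter in range(filters):
            output[sample][channel + filter * channels] = f(
                input[sample][channel],
                kernel[channel][filter]
            )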
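The loop above can be tried out on the host with a small pure-Python sketch. The source does not define f, so this sketch assumes f is a "valid" 2D cross-correlation (no padding, stride 1) of one image plane with one kernel. The names conv2d_valid, run_layer and calls are hypothetical and only serve the illustration. Data is stored plane-major, as in the pseudocode (inputs[sample][channel], kernels[channel][filter]), not in the [batchSize, imageY, imageX, channels] layout listed above.

```python
def conv2d_valid(image, kernel):
    """Hypothetical stand-in for f: 'valid' 2D cross-correlation of one
    [imageY, imageX] plane with one [kernelY, kernelX] kernel."""
    image_y, image_x = len(image), len(image[0])
    kernel_y, kernel_x = len(kernel), len(kernel[0])
    out_y = image_y - kernel_y + 1  # imageY2 under the no-padding assumption
    out_x = image_x - kernel_x + 1  # imageX2 under the no-padding assumption
    return [
        [
            sum(
                image[y + ky][x + kx] * kernel[ky][kx]
                for ky in range(kernel_y)
                for kx in range(kernel_x)
            )
            for x in range(out_x)
        ]
        for y in range(out_y)
    ]


def run_layer(inputs, kernels, f=conv2d_valid):
    """inputs:  inputs[sample][channel]  -> one image plane
    kernels: kernels[channel][filter] -> one kernel
    Returns (output, calls); output[sample] holds channels * filters planes."""
    batch_size = len(inputs)
    channels = len(kernels)
    filters = len(kernels[0])
    output = [[None] * (channels * filters) for _ in range(batch_size)]
    calls = 0  # counts how often the compute unit f is invoked
    for sample in range(batch_size):
        for channel in range(channels):
            for filt in range(filters):
                # Same output index as in the pseudocode above.
                output[sample][channel + filt * channels] = f(
                    inputs[sample][channel], kernels[channel][filt]
                )
                calls += 1
    return output, calls


# batchSize = 1, channels = 2, filters = 1, 3x3 images, 2x2 kernels
inputs = [[
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
]]
kernels = [
    [[[1, 0], [0, 1]]],
    [[[1, 1], [1, 1]]],
]
output, calls = run_layer(inputs, kernels)
print(calls)         # 2 == batchSize * channels * filters
print(output[0][0])  # [[6, 8], [12, 14]]
print(output[0][1])  # [[2, 1], [1, 2]]
```

Each call to f sees only a single image plane and a single kernel, which matches the idea that nothing has to be kept in FPGA-side memory between calls. The calls counter confirms the (batchSize * channels * filters) invocations.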