Input:
[batchSize, imageX, imageY, channels]
[imageX, imageY]
Kernel:
[kernelX, kernelY, channels, filters]
[kernelX, kernelY]
Output:
[imageX2, imageY2]
[batchSize, imageX2, imageY2, channels * filters]
Without FPGA-side memory
The work is distributed across the FPGA compute units, which are invoked
(batchSize * channels * filters) times in total.
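# f is one FPGA call: it maps an [imageX, imageY] plane and a
# [kernelX, kernelY] kernel to an [imageX2, imageY2] plane.
for sample in range(batchSize):
    for channel in range(channels):
        for filter in range(filters):
            output[sample, :, :, channel + filter * channels] = f(
                input[sample, :, :, channel],
                kernel[:, :, channel, filter]
            )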
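The loop above can be tried out in software. The following is a minimal NumPy sketch, not the FPGA implementation: `conv2d_valid` is a hypothetical stand-in for the per-call function f, and it assumes stride 1 without padding, so that imageX2 = imageX - kernelX + 1 and imageY2 = imageY - kernelY + 1. The source does not state stride or padding, so treat those as assumptions. The function name `depthwise_conv` and the call counter are likewise illustrative.

```python
import numpy as np


def conv2d_valid(plane, kernel):
    """Hypothetical stand-in for one FPGA call (assumed: stride 1, no padding).

    Computes a 2D cross-correlation of an [imageX, imageY] plane with a
    [kernelX, kernelY] kernel and returns an [imageX2, imageY2] plane.
    """
    kx, ky = kernel.shape
    out_x = plane.shape[0] - kx + 1
    out_y = plane.shape[1] - ky + 1
    out = np.zeros((out_x, out_y), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the plane per kernel tap.
    for i in range(kx):
        for j in range(ky):
            out += kernel[i, j] * plane[i:i + out_x, j:j + out_y]
    return out


def depthwise_conv(inputs, kernels, f=conv2d_valid):
    """Run f once per (sample, channel, filter) triple, as in the loop above."""
    batch_size, image_x, image_y, channels = inputs.shape
    kernel_x, kernel_y, kernel_channels, filters = kernels.shape
    assert kernel_channels == channels, "kernel and input channel counts differ"

    # Output plane size under the assumed stride-1, no-padding convolution.
    image_x2 = image_x - kernel_x + 1
    image_y2 = image_y - kernel_y + 1
    output = np.zeros(
        (batch_size, image_x2, image_y2, channels * filters), dtype=np.float64
    )

    calls = 0  # counts invocations of f, i.e. uses of a compute unit
    for sample in range(batch_size):
        for channel in range(channels):
            for filt in range(filters):
                output[sample, :, :, channel + filt * channels] = f(
                    inputs[sample, :, :, channel],
                    kernels[:, :, channel, filt],
                )
                calls += 1
    return output, calls
```

Calling `depthwise_conv` on an input of shape (2, 5, 6, 3) with a kernel of shape (3, 3, 3, 4) yields an output of shape (2, 3, 4, 12) and 2 * 3 * 4 = 24 calls to f, matching the batchSize * channels * filters count stated above. Each call reads only one input plane and one kernel slice, so the iterations are independent and need no state kept on the FPGA between calls.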