Depthwise Separable Convolution

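Depthwise separable convolution factors a standard convolution into two cheaper steps, which greatly reduces the number of parameters and the amount of computation a convolutional layer needs.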

Understanding depthwise separable convolution

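Suppose we want to convolve a 12×12×3 input into an 8×8×256 output. The direct approach uses 256 kernels of size 5×5×3 (no padding, stride 1, here and below). The layer then has 5×5×3×256 = 19,200 parameters (ignoring biases), and one forward pass takes 19,200×8×8 = 1,228,800 multiplications.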
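The two counts can be checked with a few lines of arithmetic. This is a minimal sketch: the helper name `standard_conv_cost` is made up for illustration, and it assumes no padding, stride 1, and no bias terms, as in the example.

```python
def standard_conv_cost(h_in, w_in, c_in, c_out, k):
    """Return (parameters, multiplications) for a k x k standard convolution.

    Assumes no padding, stride 1, and no bias terms.
    """
    h_out = h_in - k + 1          # 12 - 5 + 1 = 8
    w_out = w_in - k + 1
    params = k * k * c_in * c_out  # one k x k x c_in kernel per output channel
    mults = params * h_out * w_out  # every kernel is applied at every output position
    return params, mults


std_params, std_mults = standard_conv_cost(12, 12, 3, 256, 5)
print(std_params, std_mults)  # 19200 1228800
```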

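With depthwise separable convolution, the operation is split into two steps: a depthwise convolution followed by a pointwise convolution.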

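Take the same example: a 12×12×3 input and a desired 8×8×256 output. The depthwise step turns the 12×12×3 input into 8×8×3 using 3 kernels of size 5×5, one per input channel. The pointwise step is a 1×1 convolution: 256 kernels of size 1×1×3 map that 8×8×3 intermediate tensor to 8×8×256.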

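The layer now has 5×5×3 + 1×1×3×256 = 843 parameters, and one forward pass takes 843×8×8 = 53,952 multiplications. Both figures are roughly 1/23 of those for the standard convolution.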
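The same kind of check works for the separable version. The sketch below uses a hypothetical helper, `separable_conv_cost`, under the same assumptions (no padding, stride 1, no biases). It also prints the cost relative to the standard convolution, which works out to 1/c_out + 1/k² for any input size.

```python
def separable_conv_cost(h_in, w_in, c_in, c_out, k):
    """Return (parameters, multiplications) for depthwise + pointwise convolution.

    Assumes no padding, stride 1, and no bias terms.
    """
    h_out = h_in - k + 1
    w_out = w_in - k + 1
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # c_out kernels of size 1 x 1 x c_in
    params = depthwise_params + pointwise_params
    mults = params * h_out * w_out    # both steps run at every output position
    return params, mults


sep_params, sep_mults = separable_conv_cost(12, 12, 3, 256, 5)
ratio = sep_params / (5 * 5 * 3 * 256)  # relative to the standard layer's 19200 parameters
print(sep_params, sep_mults)  # 843 53952
print(round(ratio, 4))        # 0.0439, i.e. 1/256 + 1/25
```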

A brief PyTorch implementation

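import torch.nn as nn


class depthwise_separable_conv(nn.Module):
    def __init__(self, nin, nout):
        super().__init__()
        # Depthwise step: groups=nin gives each input channel its own 3x3 filter.
        # padding=1 keeps the spatial size, unlike the unpadded 5x5 example above.
        self.depthwise = nn.Conv2d(nin, nin, kernel_size=3, padding=1, groups=nin)
        # Pointwise step: a 1x1 convolution mixes the nin channels into nout channels.
        self.pointwise = nn.Conv2d(nin, nout, kernel_size=1)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out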
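The module above uses a 3×3 kernel with padding=1, so it does not reproduce the 12×12×3 to 8×8×256 example directly. As a rough sketch, the same two layers can be rebuilt with a 5×5 kernel, no padding, and no biases to check the output shape and parameter counts against the figures in the text. The helper `make_separable` is a name invented here for illustration.

```python
import torch
import torch.nn as nn


def make_separable(nin, nout, kernel_size=5):
    """Hypothetical helper: depthwise + pointwise layers with no padding and no bias."""
    return nn.Sequential(
        # Depthwise: one kernel_size x kernel_size filter per input channel.
        nn.Conv2d(nin, nin, kernel_size=kernel_size, groups=nin, bias=False),
        # Pointwise: 1x1 convolution that maps nin channels to nout channels.
        nn.Conv2d(nin, nout, kernel_size=1, bias=False),
    )


separable = make_separable(3, 256)
standard = nn.Conv2d(3, 256, kernel_size=5, bias=False)

x = torch.randn(1, 3, 12, 12)  # PyTorch expects (batch, channels, height, width)
out_shape = tuple(separable(x).shape)
separable_params = sum(p.numel() for p in separable.parameters())
standard_params = sum(p.numel() for p in standard.parameters())

print(out_shape)         # (1, 256, 8, 8)
print(separable_params)  # 843
print(standard_params)   # 19200
```

Biases are switched off only so that the counts match the hand calculation; with PyTorch's default bias=True each layer adds one extra parameter per output channel.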

References

An accessible introduction to separable convolution (深入浅出可分离卷积) https://zhuanlan.zhihu.com/p/155584110
https://discuss.pytorch.org/t/how-to-modify-a-conv2d-to-depthwise-separable-convolution/15843/6