ResNet and ResNeXt: Code Reproduction and Walkthrough of Two Sibling Networks

Author | 小马
Source | FightingCV
Editor | 极市平台

1 ResNet

1.1 Introduction

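ResNet won the Best Paper Award at CVPR 2016, and much of the later progress in deep learning builds on it. As the figure above shows, networks before ResNet were fairly shallow; even GoogLeNet had only 22 layers. ResNet addressed the difficulty of training deep CNNs, and the 2015 ResNet reached 152 layers. So why did earlier networks not simply stack more layers? Because naively stacking layers does not improve performance; the network actually degrades.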

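As the figure above shows, accuracy saturates and then even drops as the network gets deeper. This degradation problem indicates that deep networks are not easy to train.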

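The authors therefore proposed identity mapping layers (shortcut connections). If the extra layers of a deeper network can fall back to an identity mapping, the deeper network should perform at least as well as the shallower one, and degradation should not occur.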
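
The idea can be illustrated with a minimal sketch. `TinyResidual` below is a hypothetical toy block, not the article's `BottleNeck`: it computes y = F(x) + x with a single convolution as F, and the weights are zeroed only to show that when F outputs zero the block reduces to an identity mapping.

```python
import torch
from torch import nn

class TinyResidual(nn.Module):
    """Hypothetical toy residual block: y = F(x) + x."""
    def __init__(self, channels):
        super().__init__()
        # F is a single 3x3 conv here; padding=1 keeps the spatial size
        self.f = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        # Zero the weights so that F(x) = 0 (for illustration only)
        nn.init.zeros_(self.f.weight)

    def forward(self, x):
        # The shortcut adds the input back onto F(x)
        return self.f(x) + x

torch.manual_seed(0)
x = torch.randn(2, 8, 16, 16)
block = TinyResidual(8)
y = block(x)

# With F(x) = 0 the block passes its input through unchanged
is_identity = torch.allclose(y, x)
print(is_identity)
```

With the shortcut in place, a stack of such blocks only has to learn the residual F(x) on top of the identity, rather than the full mapping.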

1.2 Network Architecture

1.3 Code Implementation

Full reproduction code: https://github.com/xmu-xiaoma666/External-Attention-pytorch/blob/master/backbone_cnn/resnet.py

1) First, define a Block class (a block made of 1x1 + 3x3 + 1x1 convolutions):

from torch import nn

class BottleNeck(nn.Module):
    # The block outputs channel*expansion channels
    expansion = 4
    def __init__(self,in_channel,channel,stride=1,downsample=None):
        super().__init__()

        # 1x1 conv: reduces the channel count (and carries the stride)
        self.conv1=nn.Conv2d(in_channel,channel,kernel_size=1,stride=stride,bias=False)
        self.bn1=nn.BatchNorm2d(channel)

        # 3x3 conv: spatial feature extraction at the reduced width
        self.conv2=nn.Conv2d(channel,channel,kernel_size=3,padding=1,bias=False,stride=1)
        self.bn2=nn.BatchNorm2d(channel)

        # 1x1 conv: expands the channels back to channel*expansion
        self.conv3=nn.Conv2d(channel,channel*self.expansion,kernel_size=1,stride=1,bias=False)
        self.bn3=nn.BatchNorm2d(channel*self.expansion)

        self.relu=nn.ReLU(inplace=False)

        # Optional projection applied to the shortcut when shapes differ
        self.downsample=downsample
        self.stride=stride

    def forward(self,x):
        residual=x

        out=self.relu(self.bn1(self.conv1(x))) #bs,c,h,w
        out=self.relu(self.bn2(self.conv2(out))) #bs,c,h,w
        # No ReLU here: in the ResNet design the activation comes after the addition
        out=self.bn3(self.conv3(out)) #bs,4c,h,w

        if self.downsample is not None:
            residual=self.downsample(residual)

        out=out+residual
        return self.relu(out)
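
The role of the `downsample` argument can be sketched in isolation. In the snippet below, `main` is only a stand-in for the main path of a block (an assumption made for brevity, not the article's three-conv path): it halves the spatial size and doubles the channels, so the input can no longer be added to its output directly. A 1x1 convolution with the same stride projects the shortcut to the matching shape.

```python
import torch
from torch import nn

x = torch.randn(2, 256, 56, 56)

# Stand-in for the main path F(x): 256 -> 512 channels, 56x56 -> 28x28
main = nn.Conv2d(256, 512, kernel_size=3, stride=2, padding=1, bias=False)

# Projection shortcut: a 1x1 conv with the same stride and output channels
shortcut = nn.Conv2d(256, 512, kernel_size=1, stride=2, bias=False)

# Both branches now have the same shape, so the addition is valid
out = main(x) + shortcut(x)
print(tuple(out.shape))
```

Without the projection, `main(x) + x` would fail because the two tensors differ in both channel count and spatial size.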

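2) Next, build the full network from the Block defined above. The network has three parts, stem layer + main layer + classifier: the stem layer quickly reduces the spatial resolution; the main layer is the body of the network and extracts complex features; the classifier is an FC layer used for classification: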

class ResNet(nn.Module):
    def __init__(self,block,layers,num_classes=1000):
        super().__init__()
        # Input channels of the next block; updated by _make_layer
        self.in_channel=64
        ### stem layer
        self.conv1=nn.Conv2d(3,64,kernel_size=7,stride=2,padding=3,bias=False)
        self.bn1=nn.BatchNorm2d(64)
        self.relu=nn.ReLU(inplace=False)
        self.maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=0,ceil_mode=True)

        ### main layer
        self.layer1=self._make_layer(block,64,layers[0])
        self.layer2=self._make_layer(block,128,layers[1],stride=2)
        self.layer3=self._make_layer(block,256,layers[2],stride=2)
        self.layer4=self._make_layer(block,512,layers[3],stride=2)

        ### classifier
        self.avgpool=nn.AdaptiveAvgPool2d(1)
        self.classifier=nn.Linear(512*block.expansion,num_classes)
        self.softmax=nn.Softmax(-1)

    def forward(self,x):
        ## stem layer (shapes are for a 224x224 input)
        out=self.relu(self.bn1(self.conv1(x))) #bs,64,112,112
        out=self.maxpool(out) #bs,64,56,56

        ## layers:
        out=self.layer1(out) #bs,64*4,56,56
        out=self.layer2(out) #bs,128*4,28,28
        out=self.layer3(out) #bs,256*4,14,14
        out=self.layer4(out) #bs,512*4,7,7

        ## classifier
        out=self.avgpool(out) #bs,512*4,1,1
        out=out.reshape(out.shape[0],-1) #bs,512*4
        out=self.classifier(out) #bs,1000
        # Returns class probabilities; nn.CrossEntropyLoss expects the logits before this step
        out=self.softmax(out)

        return out

    def _make_layer(self,block,channel,blocks,stride=1):
        # downsample handles the shape mismatch between F(x) and x in H(x)=F(x)+x:
        # it projects the block input so that the shortcut matches the block output
        # in width, height and channel count, which the addition requires.
        # It is needed when stride!=1 or in_channel!=channel*block.expansion.
        downsample=None
        if(stride!=1 or self.in_channel!=channel*block.expansion):
            # Kept as a local variable so it is registered only inside the block
            downsample=nn.Conv2d(self.in_channel,channel*block.expansion,stride=stride,kernel_size=1,bias=False)
        # Only the first block of a stage may need downsample
        layers=[]
        layers.append(block(self.in_channel,channel,downsample=downsample,stride=stride))
        self.in_channel=channel*block.expansion
        for _ in range(1,blocks):
            layers.append(block(self.in_channel,channel))
        return nn.Sequential(*layers)
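
The spatial sizes in the `forward` comments can be checked with the usual output-size arithmetic. The helper `conv_out` below is a hypothetical name introduced for this sketch; it assumes dilation 1 and ignores the PyTorch edge rule that the last pooling window must start inside the input, which does not come into play for a 224x224 input.

```python
import math

def conv_out(size, kernel, stride, padding, ceil_mode=False):
    """Output length of a conv/pool layer along one spatial axis (dilation 1)."""
    span = (size + 2 * padding - kernel) / stride
    steps = math.ceil(span) if ceil_mode else math.floor(span)
    return steps + 1

# Stem: 7x7 conv with stride 2, then 3x3 max-pool with stride 2 and ceil_mode=True
stem = conv_out(224, kernel=7, stride=2, padding=3)
pooled = conv_out(stem, kernel=3, stride=2, padding=0, ceil_mode=True)

# In each stage only the first 1x1 conv is strided (stride 1, 2, 2, 2)
stage_sizes = []
size = pooled
for stride in (1, 2, 2, 2):
    size = conv_out(size, kernel=1, stride=stride, padding=0)
    stage_sizes.append(size)

print(stem, pooled, stage_sizes)  # 112 56 [56, 28, 14, 7]
```

Note that the max-pool reaches 56 only because `ceil_mode=True` rounds 54.5 up; with `padding=0` and floor rounding it would give 55.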

3) Instantiate networks with different structures and parameters:

def ResNet50(num_classes=1000):
    return ResNet(BottleNeck,[3,4,6,3],num_classes=num_classes)

def ResNet101(num_classes=1000):
    return ResNet(BottleNeck,[3,4,23,3],num_classes=num_classes)

def ResNet152(num_classes=1000):
    return ResNet(BottleNeck,[3,8,36,3],num_classes=num_classes)
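
The numbers in the model names follow from the `layers` lists. As a quick check, the sketch below counts the weighted layers under the usual convention: the stem convolution, three convolutions per bottleneck block, and the final FC layer, with the downsample projections left out. The helper name `bottleneck_depth` is made up for this example.

```python
def bottleneck_depth(layers, convs_per_block=3):
    # 1 stem conv + 3 convs in every bottleneck block + 1 FC layer
    return 1 + convs_per_block * sum(layers) + 1

configs = {
    "ResNet50": [3, 4, 6, 3],
    "ResNet101": [3, 4, 23, 3],
    "ResNet152": [3, 8, 36, 3],
}
depths = {name: bottleneck_depth(layers) for name, layers in configs.items()}
print(depths)  # {'ResNet50': 50, 'ResNet101': 101, 'ResNet152': 152}
```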

1.4 Using ResNet

from backbone_cnn.resnet import ResNet50,ResNet101,ResNet152
import torch

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnet50=ResNet50(1000)
    # resnet101=ResNet101(1000)
    # resnet152=ResNet152(1000)
    out=resnet50(input)
    print(out.shape)

2 ResNeXt

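2.1 Introduction

ResNeXt is a CVPR 2017 paper that builds on ResNet with a few changes. In it, the authors propose a highly modularized network architecture. Like ResNet, ResNeXt is built from a stack of blocks, and each block aggregates a set of transformations with the same topology. This design yields a homogeneous, multi-branch architecture. The strategy also exposes a new dimension, "cardinality", as a key factor alongside depth and width. (For an explanation of cardinality, this Zhihu answer is quite good: https://www.zhihu.com/question/323424817/answer/107870476)

The authors show experimentally that, with complexity held fixed, increasing cardinality improves classification accuracy. Moreover, when model complexity is increased, raising cardinality is more effective than going deeper or wider.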
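
To make "complexity held fixed" concrete, here is a small sketch that counts convolution weights in one bottleneck block under each design. It assumes a 256-channel input and output and the first-stage widths used in this article's code (64 for ResNet, 128 with 32 groups for ResNeXt); BatchNorm parameters are ignored, and both helper names are invented for the example.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    # Each output channel only sees c_in // groups input channels
    return kernel * kernel * (c_in // groups) * c_out

def bottleneck_params(c_in, width, c_out, groups=1):
    # 1x1 reduce + 3x3 (optionally grouped) + 1x1 expand
    return (conv_params(c_in, width, 1)
            + conv_params(width, width, 3, groups)
            + conv_params(width, c_out, 1))

resnet_block = bottleneck_params(256, 64, 256)               # width 64, dense 3x3
resnext_block = bottleneck_params(256, 128, 256, groups=32)  # width 128, 32 groups

print(resnet_block, resnext_block)  # 69632 70144
```

The ResNeXt block is twice as wide in the middle, yet grouping keeps its weight count within about 1% of the ResNet block.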

2.2 Network Architecture

2.3 Code Implementation

Full reproduction code: https://github.com/xmu-xiaoma666/External-Attention-pytorch/blob/master/backbone_cnn/resnext.py

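In terms of code, ResNeXt is really simple. Although the paper positions ResNeXt against Inception, its implementation is far simpler than Inception's.

Compared with ResNet, ResNeXt differs in two main places:

1) First, the expansion parameter changes (expansion=4 in ResNet; expansion=2 in ResNeXt).

2) Second, ResNeXt introduces the notion of cardinality, which in the implementation is the groups argument of the convolution. (It is enough to add groups=32 to the 3x3 convolution in the ResNet code.)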
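
The effect of the `groups` argument can be seen directly on `nn.Conv2d`. The sketch below compares a dense and a grouped 3x3 convolution at width 128, the first-stage width in this article's ResNeXt code; the variable names are chosen only for this example.

```python
import torch
from torch import nn

dense = nn.Conv2d(128, 128, kernel_size=3, padding=1, bias=False)
grouped = nn.Conv2d(128, 128, kernel_size=3, padding=1, bias=False, groups=32)

# Weight layout is (out_channels, in_channels // groups, kH, kW)
print(tuple(dense.weight.shape))    # (128, 128, 3, 3)
print(tuple(grouped.weight.shape))  # (128, 4, 3, 3)

n_dense = dense.weight.numel()
n_grouped = grouped.weight.numel()
print(n_dense, n_grouped)  # 147456 4608

# Input and output shapes are unchanged; only the connectivity is sparser
x = torch.randn(1, 128, 8, 8)
same_shape = grouped(x).shape == dense(x).shape
```

With 32 groups, each output channel is connected to 4 input channels instead of 128, so the layer holds 32 times fewer weights. Both the input and output channel counts must be divisible by `groups`.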

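The workflow is the same as for ResNet:

1) First, define a Block class (again a block of 1x1 + 3x3 + 1x1 convolutions; the differences are that the 3x3 Conv takes a groups argument and expansion becomes 2):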

from torch import nn

class BottleNeck(nn.Module):
    # The block outputs channel*expansion channels
    expansion = 2
    def __init__(self,in_channel,channel,stride=1,C=32,downsample=None):
        super().__init__()

        # 1x1 conv: sets the bottleneck width (and carries the stride)
        self.conv1=nn.Conv2d(in_channel,channel,kernel_size=1,stride=stride,bias=False)
        self.bn1=nn.BatchNorm2d(channel)

        # 3x3 grouped conv: C is the cardinality (number of groups)
        self.conv2=nn.Conv2d(channel,channel,kernel_size=3,padding=1,bias=False,stride=1,groups=C)
        self.bn2=nn.BatchNorm2d(channel)

        # 1x1 conv: expands the channels to channel*expansion
        self.conv3=nn.Conv2d(channel,channel*self.expansion,kernel_size=1,stride=1,bias=False)
        self.bn3=nn.BatchNorm2d(channel*self.expansion)

        self.relu=nn.ReLU(inplace=False)

        # Optional projection applied to the shortcut when shapes differ
        self.downsample=downsample
        self.stride=stride

    def forward(self,x):
        residual=x

        out=self.relu(self.bn1(self.conv1(x))) #bs,c,h,w
        out=self.relu(self.bn2(self.conv2(out))) #bs,c,h,w
        # No ReLU here: the activation comes after the addition
        out=self.bn3(self.conv3(out)) #bs,2c,h,w

        if self.downsample is not None:
            residual=self.downsample(residual)

        out=out+residual
        return self.relu(out)

2) Next, build the full network from the Block defined above. As with ResNet, it has three parts, stem layer + main layer + classifier:

class ResNeXt(nn.Module):
    def __init__(self,block,layers,num_classes=1000):
        super().__init__()
        # Input channels of the next block; updated by _make_layer
        self.in_channel=64
        ### stem layer
        self.conv1=nn.Conv2d(3,64,kernel_size=7,stride=2,padding=3,bias=False)
        self.bn1=nn.BatchNorm2d(64)
        self.relu=nn.ReLU(inplace=False)
        self.maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=0,ceil_mode=True)

        ### main layer
        self.layer1=self._make_layer(block,128,layers[0])
        self.layer2=self._make_layer(block,256,layers[1],stride=2)
        self.layer3=self._make_layer(block,512,layers[2],stride=2)
        self.layer4=self._make_layer(block,1024,layers[3],stride=2)

        ### classifier
        self.avgpool=nn.AdaptiveAvgPool2d(1)
        self.classifier=nn.Linear(1024*block.expansion,num_classes)
        self.softmax=nn.Softmax(-1)

    def forward(self,x):
        ## stem layer (shapes are for a 224x224 input)
        out=self.relu(self.bn1(self.conv1(x))) #bs,64,112,112
        out=self.maxpool(out) #bs,64,56,56

        ## layers:
        out=self.layer1(out) #bs,128*2,56,56
        out=self.layer2(out) #bs,256*2,28,28
        out=self.layer3(out) #bs,512*2,14,14
        out=self.layer4(out) #bs,1024*2,7,7

        ## classifier
        out=self.avgpool(out) #bs,1024*2,1,1
        out=out.reshape(out.shape[0],-1) #bs,1024*2
        out=self.classifier(out) #bs,1000
        # Returns class probabilities; nn.CrossEntropyLoss expects the logits before this step
        out=self.softmax(out)

        return out

    def _make_layer(self,block,channel,blocks,stride=1):
        # downsample handles the shape mismatch between F(x) and x in H(x)=F(x)+x:
        # it projects the block input so that the shortcut matches the block output
        # in width, height and channel count, which the addition requires.
        # It is needed when stride!=1 or in_channel!=channel*block.expansion.
        downsample=None
        if(stride!=1 or self.in_channel!=channel*block.expansion):
            # Kept as a local variable so it is registered only inside the block
            downsample=nn.Conv2d(self.in_channel,channel*block.expansion,stride=stride,kernel_size=1,bias=False)
        # Only the first block of a stage may need downsample
        layers=[]
        layers.append(block(self.in_channel,channel,downsample=downsample,stride=stride))
        self.in_channel=channel*block.expansion
        for _ in range(1,blocks):
            layers.append(block(self.in_channel,channel))
        return nn.Sequential(*layers)

3) Instantiate networks with different structures and parameters:

def ResNeXt50(num_classes=1000):
    return ResNeXt(BottleNeck,[3,4,6,3],num_classes=num_classes)

def ResNeXt101(num_classes=1000):
    return ResNeXt(BottleNeck,[3,4,23,3],num_classes=num_classes)

def ResNeXt152(num_classes=1000):
    return ResNeXt(BottleNeck,[3,8,36,3],num_classes=num_classes)

2.4 Using ResNeXt

from backbone_cnn.resnext import ResNeXt50,ResNeXt101,ResNeXt152
import torch

if __name__ == '__main__':
    input=torch.randn(50,3,224,224)
    resnext50=ResNeXt50(1000)
    # resnext101=ResNeXt101(1000)
    # resnext152=ResNeXt152(1000)
    out=resnext50(input)
    print(out.shape)