
Highlights&&Note

In a CNN, a feature is also called a convolution kernel (filter), usually 3×3 or 5×5 in size
^3ecb3c65

Convolutional neural networks are, in essence and in principle, still connected to the convolution operation
^e3448be7

OK — after a series of convolutions (element-wise multiplications followed by averaging), we have finally filled in a complete feature map.

^3775cc34
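
A minimal sketch of the window arithmetic described above (toy values; the 3×3 kernel and input here are hypothetical, and this article's variant averages the element-wise products rather than summing them):

import numpy as np

image = np.random.rand(5, 5)    # toy 5x5 input image
kernel = np.random.rand(3, 3)   # a 3x3 filter (the "feature")

# slide a 3x3 window over the image (stride 1);
# each feature-map cell is the mean of the element-wise products
feature_map = np.empty((3, 3))
for i in range(3):
    for j in range(3):
        feature_map[i, j] = np.mean(image[i:i+3, j:j+3] * kernel)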

Non-linear activation layer

The convolution layer applies multiple convolutions to the original image, producing a set of linear activation responses; the non-linear activation layer then applies a non-linear activation to those results.
That is, the results of the convolution are processed once more.
Non-linear activation simply means applying a non-linear activation function.
^a7676697

After the convolution operations we get feature maps full of different values. Although the amount of data is much smaller than in the original image, it is still too large (after all, deep learning routinely involves hundreds of thousands of training images), so the pooling operation that follows comes into play: its main goal is to reduce the amount of data.

There are two kinds of pooling: max pooling and average pooling. As the names suggest, max pooling takes the maximum value and average pooling takes the average.
^251b2749

Take max pooling as an example: with a pooling size of 2×2, we select a 2×2 window and write the maximum value inside it into the new feature map.

Likewise, the window slides to the right according to the stride.
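
A minimal sketch of 2×2 max pooling with stride 2 (toy feature-map values, not from the article):

import numpy as np

fmap = np.array([[1., 3., 2., 4.],
                 [5., 6., 7., 8.],
                 [3., 2., 1., 0.],
                 [1., 2., 3., 4.]])

pooled = np.empty((2, 2))
for i in range(2):
    for j in range(2):
        window = fmap[2*i:2*i+2, 2*j:2*j+2]   # the 2x2 window
        pooled[i, j] = window.max()           # use window.mean() for average pooling

# pooled == [[6., 8.],
#            [3., 4.]]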

Read more


Highlights&&Note

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data (see Bridge with NumPy).
Tensors are very similar to np.ndarray ^baaea2f8

Tensors can be created from NumPy arrays (and vice versa).
^60681efc

Bridge with NumPy

Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

Tensor to NumPy array

import torch

t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]

A change in the tensor reflects in the NumPy array.

t.add_(1)
print(f"t: {t}")
print(f"n: {n}")
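
Both now show the incremented values, since they share the same underlying memory:

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]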

Read more


Highlights&&Note

torch.utils.data.DataLoader and torch.utils.data.Dataset. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset.
Dataset defines and stores the data and labels, while DataLoader wraps the Dataset in an iterable for convenient batched processing and multi-threaded data loading. ^40d33fcf
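
A minimal sketch of this division of labor (the toy dataset here is hypothetical, not from the tutorial):

import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    def __init__(self):
        self.x = torch.randn(100, 8)           # 100 samples, 8 features
        self.y = torch.randint(0, 2, (100,))   # binary labels
    def __len__(self):
        return len(self.x)
    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]        # one sample and its label

loader = DataLoader(ToyDataset(), batch_size=16, shuffle=True)
for xb, yb in loader:                 # the DataLoader yields shuffled batches
    print(xb.shape, yb.shape)         # torch.Size([16, 8]) torch.Size([16])
    break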

PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio
^03b16e1a

To define a neural network in PyTorch, we create a class that inherits from nn.Module. We define the layers of the network in the __init__ function and specify how data will pass through the network in the forward function. To accelerate operations in the neural network, we move it to the GPU or MPS if available.

Using cuda device
NeuralNetwork(
(flatten): Flatten(start_dim=1, end_dim=-1)
(linear_relu_stack): Sequential(
(0): Linear(in_features=784, out_features=512, bias=True)
(1): ReLU()
(2): Linear(in_features=512, out_features=512, bias=True)
(3): ReLU()
(4): Linear(in_features=512, out_features=10, bias=True)
)
)
A new neural network is defined by a class that inherits from nn.Module ^60509680
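
A sketch of a class that would produce the structure printed above (28×28 input images flattened to 784 features):

import torch
from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()   # start_dim=1, end_dim=-1 by default
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )
    def forward(self, x):
        x = self.flatten(x)
        return self.linear_relu_stack(x)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = NeuralNetwork().to(device)    # move the model to the accelerator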

Content

Quickstart

Read more


Highlights&&Note

  • Understand the difference between one-, two- and n-dimensional arrays in NumPy;
  • Understand how to apply some linear algebra operations to n-dimensional arrays without using for-loops;
  • Understand axis and shape properties for n-dimensional arrays.
    Understand arrays of different dimensionality
    Compute on arrays without using for-loops
    The axis and shape properties of n-dimensional arrays
    ^39b79e0f

For example, the array for the coordinates of a point in 3D space,[1, 2, 1], has one axis. That axis has 3 elements in it, so we say it has a length of 3. In the example pictured below, the array has 2 axes. The first axis has a length of 2, the second axis has a length of 3.

[[1., 0., 0.],
[0., 1., 2.]]
^b040311e

ndarray.shape

the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
For a matrix with n rows and m columns:
shape is the shape of the matrix, (n, m)
the length of shape is the number of dimensions, ndim, here 2
^6158dbf4

ndarray.size

the total number of elements of the array. This is equal to the product of the elements of shape.
^5372022b
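
A short example tying these attributes together, using the array shown earlier:

import numpy as np

a = np.array([[1., 0., 0.],
              [0., 1., 2.]])
a.shape   # (2, 3): 2 rows, 3 columns
a.ndim    # 2: the length of the shape tuple
a.size    # 6: the product of the elements of shape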

ndarray.dtype

Read more


Highlights&&Note

Python is simpler to use, available on Windows, macOS, and Unix operating systems, and will help you get the job done more quickly.
A more lightweight scripting language ^bafd84be

Python is simple to use, but it is a real programming language, offering much more structure and support for large programs than shell scripts or batch files can offer.
Compared with shell scripts and batch files, it offers many more of the features of a real programming language ^51317e08

Python is an interpreted language
  • Compiled languages
  • Interpreted languages
    • Scripting languages
      A scripting language is a kind of interpreted language.
      Python is a scripting language, but it is more than just a scripting language. ^b49ea4e9

Content

If you do much work on computers, eventually you find that there’s some task you’d like to automate. For example, you may wish to perform a search-and-replace over a large number of text files, or rename and rearrange a bunch of photo files in a complicated way. Perhaps you’d like to write a small custom database, or a specialized GUI application, or a simple game.

If you’re a professional software developer, you may have to work with several C/C++/Java libraries but find the usual write/compile/test/re-compile cycle is too slow. Perhaps you’re writing a test suite for such a library and find writing the testing code a tedious task. Or maybe you’ve written a program that could use an extension language, and you don’t want to design and implement a whole new language for your application.

Python is just the language for you.

You could write a Unix shell script or Windows batch files for some of these tasks, but shell scripts are best at moving around files and changing text data, not well-suited for GUI applications or games. You could write a C/C++/Java program, but it can take a lot of development time to get even a first-draft program. ==Python is simpler to use, available on Windows, macOS, and Unix operating systems, and will help you get the job done more quickly.==

Python is simple to use, but it is a real programming language, offering much more structure and support for large programs than shell scripts or batch files can offer. On the other hand, Python also offers much more error checking than C, and, being a very-high-level language, it has high-level data types built in, such as flexible arrays and dictionaries. Because of its more general data types Python is applicable to a much larger problem domain than Awk or even Perl, yet many things are at least as easy in Python as in those languages.

Read more


Highlights&&Note

Therefore, the core of the whole demo is actually VisualizationDemo.

The model processes the input to produce the output: predictions = self.predictor(image). Here predictions is what the model (the self.predictor just mentioned) outputs. When reading machine learning and deep learning code, the most important thing is to trace this kind of code, where the model processes data, because it is the key to understanding the overall computational model.

From the import section in lines 9-12 of predictor.py we can learn a lot about the conventions and design methods used to structure deep learning projects: modules are usually packaged into separate folders by function. For example, as reflected in predictor.py:

  • .data: classes and methods related to data handling
  • .engine: an overall wrapper around the training and prediction logic, similar to a definition of the whole pipeline; common in large projects
  • .utils: short for utilities, generally holding commonly used utility modules, such as the visualization part seen here

In short, the larger the project, the more necessary sensible partitioning and packaging become, because from a software engineering standpoint this saves a great deal of the cost of understanding, developing, and debugging. Using similar methods sensibly in your own small projects can also noticeably improve their quality.

The training code is tools/train_net.py.

In the if __name__ == '__main__' section (the entry point of the code), we can see that detectron2 uses launch to run the contents of the main function. If we ignore the distributed-training part (i.e., the code inside the distributed scope), the logic of main is quite simple: obtain the model and the run parameters (in the cfg argument). Passing cfg into the predefined Trainer class defines the model, and everything afterwards is done through Trainer's methods: train(), as the name suggests, does the training, test() does the testing, and build_model() builds the model.
Interestingly, older versions of detectron2 did not have the invoke_main wrapper; it is unclear why it was added.
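
A condensed sketch of that entry-point structure (simplified; setup and Trainer are defined elsewhere in train_net.py, and the argument parsing is abbreviated):

from detectron2.engine import default_argument_parser, launch

def main(args):
    cfg = setup(args)                 # build the run parameters (cfg)
    trainer = Trainer(cfg)            # Trainer subclasses DefaultTrainer
    trainer.resume_or_load(resume=args.resume)
    return trainer.train()

if __name__ == "__main__":
    args = default_argument_parser().parse_args()
    launch(main, args.num_gpus, num_machines=args.num_machines,
           machine_rank=args.machine_rank, dist_url=args.dist_url,
           args=(args,))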

In the definition of Trainer we find that it is a subclass of engine.DefaultTrainer, and the analysis of main above shows that Trainer's main functionality actually comes from DefaultTrainer.
For the trainer, it is enough to read engine.DefaultTrainer.

Looking at the file names and the __all__ declarations at the top of each file, we can roughly guess how the files reference one another and what each file is mainly responsible for.

defaults.py contains the DefaultTrainer we saw in train_net.py, and it too mostly imports from the others.

launch in launch.py is, as its name suggests, the code that sets the algorithm running. Skimming its main function, launch, the parameters num_machines and machine_rank tell us it is responsible for distributed training, and since main_func is among its parameters, we know launch does not involve detectron2's actual functionality.
Distributed training; safe to ignore.

This clarifies the hierarchy of the engine part. Specifically, we read the code in the following order:

  • train_loop.py
  • hooks.py
  • defaults.py

HookBase is the base class for hooks; it implements the methods before_step, before_train, after_step, and after_train. Their main purpose is to do the per-step preparation before the actual training happens. Different Trainers can use different hooks. "Hook" can be pictured literally: hooks hang onto the Trainer at the beginning and end of training, like two hooks holding up the trainer that does the work.

TrainerBase holds multiple hooks, and in the Trainer's before_step, after_step, and related functions we can see that each hook's pre-step preparation (HookBase.before_step) and post-step cleanup (HookBase.after_step) must be executed. The training process itself is entirely ordinary: for the configured number of iterations, run the three functions before_step, run_step, and after_step.
In Python, a hook usually refers to a function that is executed automatically at a specific moment.
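
A minimal sketch of this hook-based loop (a simplification of the pattern described above, not the exact detectron2 source):

class TrainerBase:
    def __init__(self):
        self._hooks = []   # HookBase instances registered by the trainer
    def train(self, start_iter, max_iter):
        self.before_train()
        for self.iter in range(start_iter, max_iter):
            self.before_step()   # every hook's per-step preparation
            self.run_step()      # left to subclasses (e.g. SimpleTrainer)
            self.after_step()    # every hook's per-step cleanup
        self.after_train()
    def before_train(self):
        for h in self._hooks:
            h.before_train()
    def before_step(self):
        for h in self._hooks:
            h.before_step()
    def after_step(self):
        for h in self._hooks:
            h.after_step()
    def after_train(self):
        for h in self._hooks:
            h.after_train()
    def run_step(self):
        raise NotImplementedError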

In SimpleTrainer the authors implement the most basic procedure for training a neural network; it appears as a subclass of the TrainerBase from the previous paragraph. Its main job is to implement the run_step method that TrainerBase leaves unimplemented. In fact, the procedure implemented in SimpleTrainer is also the most universal training loop (see the sketch after this list):

  • read data with iter(dataloader)
  • compute each batch's loss with loss_dict = self.model(data)
  • run the training step with self.optimizer.zero_grad(), losses.backward(), and self.optimizer.step()
  • record and print the computed metrics through a unified structure, _write_metrics
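
A condensed sketch of such a run_step, following the four steps above (simplified, not the exact source):

def run_step(self):
    data = next(self._data_loader_iter)   # 1. read a batch from the dataloader
    loss_dict = self.model(data)          # 2. compute the batch's losses
    losses = sum(loss_dict.values())
    self.optimizer.zero_grad()            # 3. the optimization step
    losses.backward()
    self.optimizer.step()
    self._write_metrics(loss_dict)        # 4. record and print the metrics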
Read more