Python Toolkit: Fundamentals of NumPy, the Scientific Computing Package

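1 NumPy Basic Concepts
2 Advantages of NumPy
- 2.1 Python List Code
- 2.2 NumPy Code
3 Creating NumPy Arrays
4 Printing NumPy Arrays
- 4.1 np.set_printoptions(threshold=sys.maxsize)
5 NumPy Array Arithmetic
6 NumPy Universal Functions
7 NumPy Indexing, Slicing, and Iterating
8 Changing the Shape of NumPy Arrays
9 Stacking NumPy Arrays
10 Splitting NumPy Arrays
11 NumPy Copies and Views
12 Converting NumPy Arrays
13 NumPy Function and Method Reference
14 Broadcasting
15 Indexing
16 NumPy Application Examples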

NumPy Basic Concepts¶

Suppose a1 holds the coordinates of a point in 3D space. [1, 2, 1] is an array of rank 1, because it has one axis. That axis has length 3.

import numpy as np
a1=np.array([1,2,1])
a1

array([1, 2, 1])

The array a2 has 2 axes. The first axis (dimension) has length 2 and the second axis (dimension) has length 3.

import numpy as np
a2=np.arange(6).reshape(2, 3)
a2

array([[0, 1, 2],
       [3, 4, 5]])

# Example
import numpy as np
a3=np.arange(15).reshape(3, 5)
a3

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

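ndarray.ndim¶

The number of axes (dimensions) of the array. In the Python world, the number of dimensions is often called the rank.

a3.ndim  # get the rank of a3 (its number of axes)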

2

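ndarray.shape¶

The dimensions of the array.

This is a tuple of integers giving the number of elements (the length) along each axis, that is, the size of the array in each dimension.

For a matrix with n rows and m columns, shape is (n, m).

The length of the shape tuple is therefore the rank, or number of dimensions, ndim.

a3.shape  # a tuple of integers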

(3, 5)

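ndarray.dtype¶

An object describing the type of the elements in the array.

A dtype can be created or specified using standard Python types.

NumPy also provides types of its own, for example numpy.int32, numpy.int16, and numpy.float64.

a3.dtype  # element type of the array

dtype('int64')

a3.dtype.name  # name of the element type

'int64'

np.sctypeDict.keys() # the full list of NumPy data type names and aliases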

dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'Bool', 'b1', 'float16', 'Float16', 'f2', 'float32', 'Float32', 'f4', 'float64', 'Float64', 'f8', 'float128', 'Float128', 'f16', 'complex64', 'Complex32', 'c8', 'complex128', 'Complex64', 'c16', 'complex256', 'Complex128', 'c32', 'object0', 'Object0', 'bytes0', 'Bytes0', 'str0', 'Str0', 'void0', 'Void0', 'datetime64', 'Datetime64', 'M8', 'timedelta64', 'Timedelta64', 'm8', 'int64', 'Int64', 'i8', 'uint64', 'UInt64', 'u8', 'int32', 'Int32', 'i4', 'uint32', 'UInt32', 'u4', 'int16', 'Int16', 'i2', 'uint16', 'UInt16', 'u2', 'int8', 'Int8', 'i1', 'uint8', 'UInt8', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'unicode_', 'object_', 'bytes_', 'str_', 'string_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a'])

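Creating a custom data type¶

A custom data type is a heterogeneous (structured) data type.
It can serve as the structure for one row of a spreadsheet or database table.
For example:
- a data type that stores inventory information for a product
  - a string of up to 40 characters for the product name
  - an integer for the quantity in stock (the Python int used below maps to a 64-bit integer, '<i8', on this platform; use np.int32 for a 32-bit field)
  - a floating-point number for the price (the Python float used below maps to a 64-bit float, '<f8'; use np.float32 for 32-bit single precision)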

a3_01=np.dtype([('name',str,40),('numitems', int),('price', float)])
a3_01

dtype([('name', '<U40'), ('numitems', '<i8'), ('price', '<f8')])

a3_02=np.array([('a',10,3.12),('b',15,3.2)],dtype=a3_01) 
# Create the array with dtype=a3_01, the custom data type; the dtype must be passed explicitly as an argument
a3_02

array([('a', 10, 3.12), ('b', 15, 3.2 )],
      dtype=[('name', '<U40'), ('numitems', '<i8'), ('price', '<f8')])

print(a3_02)

[('a', 10, 3.12) ('b', 15, 3.2 )]

a3_02.shape # essentially a 1-D array with 2 elements, each of which is a record (numpy.void), not a list

(2,)

print(a3_02[1])
print(type(a3_02[1]))
print(type(list(a3_02[1])))

('b', 15, 3.2)
<class 'numpy.void'>
<class 'list'>
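The description of the custom data type calls for 32-bit fields, while Python's int and float map to 64-bit types. A minimal sketch of the 32-bit variant follows, assuming explicit np.int32 and np.float32 fields; the names item_dtype and stock are made up for this illustration.

```python
import numpy as np

# Hypothetical inventory record with explicit 32-bit numeric fields
item_dtype = np.dtype([('name', 'U40'), ('numitems', np.int32), ('price', np.float32)])
stock = np.array([('a', 10, 3.12), ('b', 15, 3.2)], dtype=item_dtype)

names = stock['name']                        # all values of one field, as a regular array
total_items = int(stock['numitems'].sum())   # 10 + 15
record_bytes = item_dtype.itemsize           # 40 characters * 4 bytes + 4 + 4
```

Indexing a structured array with a field name returns that column for every record, which is often more convenient than converting each record to a list.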

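ndarray.itemsize¶

The size in bytes of each element of the array, that is, the number of bytes it occupies in memory.

For example, an array of float64 elements has itemsize 8 (= 64/8),

and an array of complex64 elements also has itemsize 8 (= 64/8), since each element holds two 32-bit floats.

It is equal to ndarray.dtype.itemsize.

a3.itemsize  # size of each array element in bytes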

8

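ndarray.size¶

The total number of elements in the array.

It equals the product of the elements of shape.

a3.size # total number of elements, equal to the product of the entries of the shape tuple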

15

ndarray.nbytes¶

The total storage occupied by the elements of the array
= ndarray.itemsize * ndarray.size

a3.nbytes

120

type()¶

type(a3)  # the type of the array object

numpy.ndarray

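ndarray.data¶

The buffer containing the actual elements of the array.

Normally we do not need this attribute, because we access the elements of an array through indexing.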

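Advantages of NumPy¶

NumPy arrays are more efficient for numerical computation than Python's built-in list container.
Using NumPy removes many explicit loops from the code, which makes it more concise.
A NumPy ndarray is a multidimensional array object made up of two parts:
1. the actual data
2. the metadata describing that data
Many array operations (such as reshaping, transposing, and slicing) modify only the metadata and leave the underlying data unchanged.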
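The split between data and metadata can be observed directly. The sketch below is only an illustration with made-up variable names: it reshapes and transposes an array, then checks with np.shares_memory that all three objects still use the same buffer.

```python
import numpy as np

base = np.arange(12)
reshaped = base.reshape(3, 4)   # new shape metadata, same buffer
transposed = reshaped.T         # new strides metadata, same buffer

shares = np.shares_memory(base, reshaped) and np.shares_memory(base, transposed)
strides_before = reshaped.strides
strides_after = transposed.strides  # the two stride entries are swapped

reshaped[0, 0] = 99             # writing through the view changes base too
```

Because only the shape and strides differ, none of these operations copies the twelve stored integers.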

Python List Code¶

def pythonsum(n):
    a=list(range(n))
    b=list(range(n))
    c=[]
    for i in range(len(a)):
        a[i]=i ** 2
        b[i]=i ** 3
        c.append(a[i]+b[i])
    return c
pythonsum(3)

[0, 2, 12]

NumPy Code¶

import numpy as np
def numpysum(n):
    a=np.arange(n)**2  # arange creates a NumPy array of the integers 0 to n-1
    b=np.arange(n)**3
    c=a+b
    return c
numpysum(3)

array([ 0,  2, 12])

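Creating NumPy Arrays¶

np.array() creates an array from a regular Python list or tuple¶

The type of the resulting array is deduced from the types of the elements in the Python sequence.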

import numpy as np
a4 = np.array([2,3,4])
a4

array([2, 3, 4])

a4.dtype  # int64

dtype('int64')

import numpy as np
a5=np.array([1.2, 3.5, 5.1])
a5

array([1.2, 3.5, 5.1])

a5.dtype  # float64

dtype('float64')

The type of the array can also be specified explicitly at creation time.

a6 = np.array( [ [1,2], [3,4] ], dtype=str )
a6

array([['1', '2'],
       ['3', '4']], dtype='<U1')

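array transforms sequences of sequences into two-dimensional arrays, and sequences of sequences of sequences into three-dimensional arrays.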

a7=np.array([(1.5,2,3),(4,5,6),(7,8,9)])
a7

array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ],
       [7. , 8. , 9. ]])

a7.ndim

2

a8=np.array([( (1,2), (3,4) )]) # the nesting depth of the sequences determines the number of axes (the rank)
a8

array([[[1, 2],
        [3, 4]]])

a8.ndim

3

a81=np.array([np.arange(3), np.arange(3,6), np.arange(6,9)]) 
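# array builds an array from the given object
# The object must be array-like (here, a list of arange results); it is the only required argument of array, and all other arguments are optional with default values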
a81

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

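Often the elements of an array are not known at first, but its size is.
NumPy therefore provides several functions that create arrays with initial placeholder content.
This reduces the need to grow arrays, which is an expensive operation.
By default, the dtype of the created array is float64.

np.arange() creates a sequence and returns an array rather than a list (similar to range)¶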

a12=np.arange( 10, 30, 5 ) # start at 10, stop before 30, step 5
a12

array([10, 15, 20, 25])

a13=np.arange( 0, 2, 0.3 )  # start at 0, stop before 2, step 0.3
a13

array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

np.zeros() creates an array filled with zeros¶

a9=np.zeros( (3,4) )
a9

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

np.ones() creates an array filled with ones¶

a10=np.ones( (2,3,4), dtype=np.int16 )
a10

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)

np.full() creates an array of a given size filled with a single value¶

a10_01=np.full(10, 5.666)
a10_01

array([5.666, 5.666, 5.666, 5.666, 5.666, 5.666, 5.666, 5.666, 5.666,
       5.666])

ndarray.fill() fills an existing array with a value¶

a10_01.fill(999)
a10_01

array([999., 999., 999., 999., 999., 999., 999., 999., 999., 999.])

np.empty() creates an uninitialized array whose contents depend on the state of memory¶

a11=np.empty( (2,3) )
a11

array([[0.3, 0.6, 0.9],
       [1.2, 1.5, 1.8]])

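np.linspace() creates a sequence from a number of elements rather than a step¶

When arange is used with floating-point arguments, it is generally not possible to predict the number of elements obtained, because floating-point precision is finite.
For this reason it is usually better to use linspace, which takes the number of elements we want as an argument instead of the step.

import numpy as np
a14=np.linspace( 0, 2, 7 )                 # 7 numbers from 0 to 2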
a14

array([0.        , 0.33333333, 0.66666667, 1.        , 1.33333333,
       1.66666667, 2.        ])
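As a small illustration of why linspace is more predictable than arange with a float step, the sketch below requests a fixed number of points and checks the endpoint handling. The endpoint and retstep keyword arguments are part of np.linspace; the variable names are chosen only for this example.

```python
import numpy as np

with_end = np.linspace(0, 2, 7)                     # 7 points, 2.0 included
without_end = np.linspace(0, 2, 7, endpoint=False)  # 7 points, 2.0 excluded
points, step = np.linspace(0, 2, 5, retstep=True)   # also return the spacing
```

In both calls the length of the result is exactly the number requested, whatever rounding happens in the intermediate values.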

import numpy as np
from numpy import pi
a15=np.linspace( 0, 2*pi, 10 )        # useful to evaluate function at lots of points
a16=np.sin(a15)
a16

array([ 0.00000000e+00,  6.42787610e-01,  9.84807753e-01,  8.66025404e-01,
        3.42020143e-01, -3.42020143e-01, -8.66025404e-01, -9.84807753e-01,
       -6.42787610e-01, -2.44929360e-16])

np.random.rand() returns one or more random samples from the uniform distribution over [0, 1)¶

a161=np.random.rand(2,2,5)
a161

array([[[9.67402198e-01, 7.40398534e-02, 1.68580764e-02, 9.37147900e-01,
         8.67411623e-01],
        [9.17847783e-01, 9.82103223e-01, 3.21036897e-01, 6.19037492e-01,
         8.03422598e-01]],

       [[9.66868348e-01, 2.25744664e-04, 9.26264806e-01, 4.74807738e-02,
         5.91632295e-01],
        [4.30744214e-02, 7.75219057e-01, 2.46949156e-01, 9.42760345e-01,
         6.23072934e-01]]])

np.random.randint(low, high=None, size=None, dtype='l')¶

Return random integers from low (inclusive) to high (exclusive)

a162=np.random.randint(5, size=(2, 2, 5))
a162

array([[[3, 4, 4, 2, 3],
        [1, 3, 3, 4, 0]],

       [[2, 3, 2, 1, 1],
        [3, 0, 2, 0, 0]]])

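Printing NumPy Arrays¶

print() displays an array in a way similar to nested lists:

one-dimensional arrays are printed as rows,
two-dimensional arrays as matrices,
and three-dimensional arrays as lists of matrices.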

a17=np.arange(6)                         # 1d array
print(a17)

[0 1 2 3 4 5]

a18=np.arange(12).reshape(4,3)           # 2d array
print(a18)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

a19=np.arange(24).reshape(2,3,4)         # 3d array; each slice is separated from the next by a blank line
print(a19)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

import numpy as np
print(np.arange(10))

[0 1 2 3 4 5 6 7 8 9]

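np.set_printoptions(threshold=sys.maxsize)¶

import sys
np.set_printoptions(threshold=sys.maxsize)  # threshold=np.nan raises a ValueError in recent NumPy versions
print(np.arange(10))  # change 10 to 10000 to see a large array printed in full, without truncation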

[0 1 2 3 4 5 6 7 8 9]
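Changing the print threshold globally affects every later print call. A hedged alternative, assuming NumPy 1.15 or newer, is the np.printoptions context manager, which restores the previous settings on exit. The sketch uses np.array2string so that the text can be inspected instead of printed.

```python
import sys
import numpy as np

big = np.arange(2000)

# With the default threshold of 1000 elements the middle is replaced by "..."
with np.printoptions(threshold=1000):
    summarized = np.array2string(big)

# Inside this block arrays are rendered in full; settings are restored afterwards
with np.printoptions(threshold=sys.maxsize):
    full = np.array2string(big)
```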

NumPy Array Arithmetic¶

a20=np.array( [20,30,40,50] )
a20

array([20, 30, 40, 50])

Element-wise conditions¶

a20<35

array([ True,  True, False, False])

10*np.sin(a20)

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

a21=np.arange( 4 )
a21

array([0, 1, 2, 3])

Array subtraction -¶

a22=a20-a21
a22

array([20, 29, 38, 47])

Element-wise power ** (squaring in this example)¶

a23=a21**2
a23

array([0, 1, 4, 9])

The multiplication operator * operates element-wise on NumPy arrays.
The matrix product can be computed with the dot function or method.

a24=np.array( [[1,1,1], [0,1,0]] )
a24

array([[1, 1, 1],
       [0, 1, 0]])

a25=np.array( [[2,0,1], [3,4,1]] )
a25

array([[2, 0, 1],
       [3, 4, 1]])

Element-wise arithmetic with *¶

a24*a25  # multiply the elements at the same positions

array([[2, 0, 1],
       [0, 4, 0]])

Matrix product with dot¶

a26=np.array( [[2,0,1], [1,1,2]] )
print(a26)
a26.shape

[[2 0 1]
 [1 1 2]]

(2, 3)

a27=np.array( [[2,0,1,0], [1,1,4,0], [1,1,4,0]] )
print(a27)
a27.shape

[[2 0 1 0]
 [1 1 4 0]
 [1 1 4 0]]

(3, 4)

a28=a26.dot(a27)
print(a28)
a28.shape
# 2*2+0*1+1*1=5, 2*0+0*1+1*1=1, 2*1+0*4+1*4=6, 2*0+0*0+1*0=0
# 1*2+1*1+2*1=5, 1*0+1*1+2*1=3, 1*1+1*4+2*4=13, 1*0+1*0+2*0=0

[[ 5  1  6  0]
 [ 5  3 13  0]]

(2, 4)

np.dot(a26, a27)   # same as a26.dot(a27)

array([[ 5,  1,  6,  0],
       [ 5,  3, 13,  0]])
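For 2-D arrays, the @ operator (Python 3.5 and later) gives the same matrix product as dot. The sketch below contrasts it with the element-wise *, using two small arrays named a and b for this illustration only.

```python
import numpy as np

a = np.array([[1, 1, 1], [0, 1, 0]])
b = np.array([[2, 0, 1], [3, 4, 1]])

elementwise = a * b     # shapes must match (or broadcast): result is (2, 3)
matrix = a @ b.T        # (2, 3) @ (3, 2) -> (2, 2)
same = np.array_equal(matrix, a.dot(b.T))
```

The element-wise product keeps the shape of its operands, while the matrix product contracts the inner dimension, which is why b has to be transposed here.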

+= and *= in-place addition and multiplication¶

These operators modify an existing array in place rather than creating a new one.

a29=np.ones((2,3), dtype=int)
print(a29)
a29 *=3
a29

[[1 1 1]
 [1 1 1]]

array([[3, 3, 3],
       [3, 3, 3]])

a30=np.ones((2,3), dtype=int)
print(a30)
print(a29)
a30 += a29
a30

[[1 1 1]
 [1 1 1]]
[[3 3 3]
 [3 3 3]]

array([[4, 4, 4],
       [4, 4, 4]])
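One consequence of in-place operators is that the result must fit the dtype of the existing array. The following sketch, with illustrative variable names, shows that adding a float to an integer array in place raises a TypeError, while the ordinary + creates a new float array.

```python
import numpy as np

ints = np.ones((2, 3), dtype=int)
alias = ints
ints += 2                    # the same object is updated in place
still_same_object = alias is ints

try:
    ints += 0.5              # a float result cannot be stored in an int array
    raised = False
except TypeError:
    raised = True

floats = ints + 0.5          # a new float64 array is created instead
```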

Unary operations, such as computing the sum of all the elements in an array, are implemented as methods of the ndarray class.

a31=np.random.random((3,4))
a31

array([[0.47124497, 0.25283129, 0.0106757 , 0.85918606],
       [0.05744839, 0.75878927, 0.4591551 , 0.97007796],
       [0.58957094, 0.00198441, 0.04586998, 0.49072614]])

np.sum(axis=0/1)¶

a31.sum()

4.967560205915041

a31.sum(axis=0)

array([1.1182643 , 1.01360497, 0.51570078, 2.31999015])

np.max()¶

a31.max()

0.9700779575699309

a31.max(axis=0)

array([0.58957094, 0.75878927, 0.4591551 , 0.97007796])

np.min()¶

a31.min()

0.0019844073380141936

a31.min(axis=1)

array([0.0106757 , 0.05744839, 0.00198441])

np.cumsum(axis=0/1) cumulative sum¶

a31.cumsum(axis=0)  # cumulative sum along axis 0 (down each column); returns an array of the intermediate results

array([[0.47124497, 0.25283129, 0.0106757 , 0.85918606],
       [0.52869336, 1.01162056, 0.4698308 , 1.82926401],
       [1.1182643 , 1.01360497, 0.51570078, 2.31999015]])

a31.cumsum(axis=1)  # cumulative sum along each row

array([[0.47124497, 0.72407626, 0.73475196, 1.59393802],
       [0.05744839, 0.81623766, 1.27539277, 2.24547072],
       [0.58957094, 0.59155534, 0.63742532, 1.12815146]])

np.around() returns the rounded values¶

The precision can be specified with the decimals argument.

import numpy as np
a311=np.array([-0.746, 4.6, 9.4, 7.447, 10.455, 15.555])
print(a311)
a312=np.around(a311) 
print(a312)
a313=np.around(a311, decimals=2)
print(a313)
a314=np.around(a311, decimals=-1)
print(a314)

[-0.746  4.6    9.4    7.447 10.455 15.555]
[-1.  5.  9.  7. 10. 16.]
[-0.75  4.6   9.4   7.45 10.46 15.56]
[-0.  0. 10. 10. 10. 20.]
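One detail worth checking is how np.around treats values exactly halfway between two integers: NumPy rounds them to the nearest even value. A minimal sketch with an illustrative array:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.around(halves)   # ties go to the nearest even value
```

So 0.5 and 2.5 round down to 0 and 2, while 1.5 and 3.5 round up to 2 and 4.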

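np.floor() returns the largest integer not greater than the input¶

The result equals the input when the input is already an integer.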

import numpy as np
a315=np.array([-1.7, -2.5, -0.2, 0.6, 1.2, 2.7, 11])
print(a315)
a316=np.floor(a315)
print(a316)

[-1.7 -2.5 -0.2  0.6  1.2  2.7 11. ]
[-2. -3. -1.  0.  1.  2. 11.]

np.ceil() returns the ceiling of the input¶

For an input x, it returns the smallest integer i such that i >= x.

import numpy as np
a317=np.array([-1.7, -2.5, -0.2, 0.6, 1.2, 2.7, 11])
print(a317)
a318=np.ceil(a317)
print(a318)

[-1.7 -2.5 -0.2  0.6  1.2  2.7 11. ]
[-1. -2. -0.  1.  2.  3. 11.]

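np.where() selects elements according to a condition¶

np.where(condition, x, y)
Where the condition holds, the output takes the element from x; otherwise it takes the element from y.

With only condition and no x or y, it returns the coordinates of the elements that satisfy the condition (that is, the non-zero elements).
This is equivalent to numpy.nonzero.
The coordinates are given as a tuple.
The tuple contains one array per dimension of the original array, each holding the coordinates of the matching elements along that dimension.
Each array in the tuple has as many entries as there are matching elements; the entries at the same position across the arrays together form the coordinates of one element.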

import numpy as np 
a319=np.random.random([2, 3]) 
print(a319)
print(a319.ndim)
print('-'*100)
a320=np.where(a319>0.5, a319, 0)
print(a320)
print('-'*100)
a321=np.where(a319>0.5) # returns the indices
print(a321)
print(a319[a321])

[[0.84690613 0.36558025 0.77342995]
 [0.94935192 0.83646104 0.97409553]]
2
----------------------------------------------------------------------------------------------------
[[0.84690613 0.         0.77342995]
 [0.94935192 0.83646104 0.97409553]]
----------------------------------------------------------------------------------------------------
(array([0, 0, 1, 1, 1]), array([0, 2, 0, 1, 2]))
[0.84690613 0.77342995 0.94935192 0.83646104 0.97409553]
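To make the layout of the returned tuple concrete, the sketch below uses a small fixed array (named m for this example) instead of random data, and compares the one-argument form with np.argwhere, which returns the same coordinates grouped per element.

```python
import numpy as np

m = np.array([[1, 0, 2],
              [0, 3, 0]])

rows, cols = np.where(m > 0)          # one index array per axis
pairs = list(zip(rows.tolist(), cols.tolist()))
coords = np.argwhere(m > 0)           # the same coordinates, one row per element
replaced = np.where(m > 0, m, -1)     # three-argument form: choose element-wise
```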

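NumPy Universal Functions¶

NumPy provides familiar mathematical functions such as sin, cos, exp, and add.
In NumPy, these are called "universal functions" (ufunc).
These functions operate element-wise on an array and produce an array as output.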

a32=np.arange(3)
print(a32)
a32 **=a32  # 0**0 evaluates to 1 here, consistent with the limit of x^x as x approaches 0 from the right
a32

[0 1 2]

array([1, 1, 4])

np.exp() exponential function e**x¶

np.exp(a32)  # returns e raised to the power of each element; e is the constant 2.71828...

array([ 2.71828183,  2.71828183, 54.59815003])

np.sqrt() square root¶

np.sqrt(a32) # square root

array([1., 1., 2.])

np.add()¶

a33=np.array([2., -1., 4.])
print(a33)
print(a32)
np.add(a32, a33)

[ 2. -1.  4.]
[1 1 4]

array([3., 0., 8.])

np.cbrt() cube root¶

print(np.add(a32, a33))
np.cbrt(np.add(a32, a33))

[3. 0. 8.]

array([1.44224957, 0.        , 2.        ])

NumPy Indexing, Slicing, and Iterating¶

a34=np.arange(10)**3
print(a34)

[  0   1   8  27  64 125 216 343 512 729]

a34[2]

8

a34[2:5]

array([ 8, 27, 64])

Slicing with a step [::]¶

a34=np.arange(10)**3
print(a34)
a34[:6:2] = 1000
print(a34)

[  0   1   8  27  64 125 216 343 512 729]
[1000    1 1000   27 1000  125  216  343  512  729]

Reversed order [ : :-1]¶

a34[ : :-1]

array([ 729,  512,  343,  216,  125, 1000,   27, 1000,    1, 1000])

print(a34)
for i in a34:
    print(np.cbrt(i)) # np.cbrt() is the cube-root function

[1000    1 1000   27 1000  125  216  343  512  729]
10.0
1.0
10.0
3.0000000000000004
10.0
5.0
6.000000000000001
7.0
8.0
9.000000000000002

np.fromfunction¶

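# np.fromfunction() builds an array by calling a function; x and y are arrays of indices, running from 0 to 4 along the first axis and from 0 to 3 along the second for shape (5,4)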
def f(x,y):
    return 10*x+y
a35=np.fromfunction(f,(5,4),dtype=int)
a35

array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

a35[2,3]

23

a35.ndim

2

Slicing a 2-D array¶

a35[0:5, 1]

array([ 1, 11, 21, 31, 41])

a35[ : ,1]

array([ 1, 11, 21, 31, 41])

a35[1:3, : ]

array([[10, 11, 12, 13],
       [20, 21, 22, 23]])

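a35[-1] # when fewer indices are given than there are axes, the missing indices are treated as complete slices: the row index is -1 (the last row) and the missing column index selects everything
# three dots ( ... ) also stand for as many colons as needed to produce a complete index tuple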

array([40, 41, 42, 43])

a35[4,:] # verifies the previous cell

array([40, 41, 42, 43])

Iterating over a multidimensional array is done with respect to the first axis.

for row in a35:
    print(row)

[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]

a35_01=np.arange(24).reshape(2,3,4)
a35_01

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

a35_01.shape

(2, 3, 4)

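a35_01[1,0,::3]  # select elements at a regular interval within the slice
# first axis, index 1: the block holding 12 ... 23
# second axis, index 0: the row 12, 13, 14, 15
# third axis, every 3rd element starting at 0: 12, 15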

array([12, 15])

a35_01[::-1] # reverses only the first axis

array([[[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]],

       [[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]])

a35_01[:,:,-1] # by contrast: all of the first axis, all of the second axis, and the last column of the third axis

array([[ 3,  7, 11],
       [15, 19, 23]])

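ndarray.flat array element iterator¶

flat returns a numpy.flatiter object; this attribute is the only way to obtain a flatiter, since one cannot be constructed directly.
It lets us traverse an array of any dimension as if it were one-dimensional.
To perform an operation on every element of an array, use the flat attribute, which is an iterator over all the elements of the array.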

for element in a35.flat[0:5]:
    print(element)

0
1
2
3
10

Changing the Shape of NumPy Arrays¶

a36=np.floor(10*np.random.random((3,4)))
a36

array([[7., 9., 0., 4.],
       [9., 9., 3., 9.],
       [9., 8., 2., 2.]])

a36.shape

(3, 4)

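The following commands (such as ravel, reshape, and T) all return a modified array but do not change the original array.

np.ravel()¶

The order of the elements in the array produced by ravel() is normally "C-style",
that is, the rightmost index "changes the fastest", so the element after [0,0] is [0,1], then [0,2], [0,3], [1,0], [1,1], [1,2] ...
If the array is reshaped to some other shape, it is again treated as "C-style".
NumPy normally creates arrays stored in this order, so ravel() usually does not need to copy its argument.
But if the array was made by slicing another array or created with unusual options, it may need to be copied.
The functions ravel() and reshape() can also be instructed, through an optional argument, to use FORTRAN-style arrays, in which the leftmost index changes the fastest.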

a36.ravel()  # returns the array, flattened

array([7., 9., 0., 4., 9., 9., 3., 9., 9., 8., 2., 2.])

ndarray.flatten() gives the same result as ravel() but always allocates new memory for it¶

a36.flatten()

array([7., 9., 0., 4., 9., 9., 3., 9., 9., 8., 2., 2.])
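The difference between ravel() and flatten() does not show in the printed output, but it can be checked with np.shares_memory. The sketch below is an illustration on a small contiguous array with made-up names; it also shows the order='F' option for Fortran-style ordering.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

r = a.ravel()      # contiguous input: a view, no copy is needed
f = a.flatten()    # always a fresh copy

ravel_shares = np.shares_memory(a, r)
flatten_shares = np.shares_memory(a, f)

col_major = a.ravel(order='F')   # Fortran order: the leftmost index varies fastest
```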

np.reshape()¶

a36.reshape(6,2)  # returns the array with a modified shape

array([[7., 9.],
       [0., 4.],
       [9., 9.],
       [3., 9.],
       [9., 8.],
       [2., 2.]])

If a dimension is given as -1 in a reshape operation, that dimension is calculated automatically from the others.

a36.reshape(2,-1)

array([[7., 9., 0., 4., 9., 9.],
       [3., 9., 9., 8., 2., 2.]])

ndarray.T transpose¶

a36.T  # returns the array, transposed

array([[7., 9., 9.],
       [9., 9., 8.],
       [0., 3., 2.],
       [4., 9., 2.]])

print(a36.T.shape)
print(a36.shape)

(4, 3)
(3, 4)

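np.transpose() transpose¶

Understanding the arguments of transpose()

The array a36_01 (defined below) has shape (2, 3, 5) and is three-dimensional, so the indices of its shape tuple are (0, 1, 2).
In other words, a36_01.shape can be annotated as (2[0], 3[1], 5[2]), where each [] holds the index into the shape tuple.
After b = a36_01.transpose(1, 0, 2), b.shape becomes (3, 2, 5).
That is, transpose permutes the axes of a multidimensional array.
Because the shape changes, the elements are rearranged accordingly.
For example:
- element 11 is at a36_01[0][2][1]
- after b = a36_01.transpose(1, 0, 2)
- element 11 is at b[2][0][1]
- likewise, element 28 is at a36_01[1][2][3] in the original and at b[2][1][3] in b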

a36

array([[7., 9., 0., 4.],
       [9., 9., 3., 9.],
       [9., 8., 2., 2.]])

a36.transpose(1,0) # only a view; a36 itself is not changed

array([[7., 9., 9.],
       [9., 9., 8.],
       [0., 3., 2.],
       [4., 9., 2.]])

import numpy as np
a36_01 = np.array(range(30)).reshape(2, 3, 5)
print(a36_01.shape)
print ("a36_01 = ")
print (a36_01)
print ("\n=====================\n")
print(a36_01.transpose(1, 0, 2).shape)
print ("a36_01.transpose() = ")
print (a36_01.transpose(1, 0, 2))

(2, 3, 5)
a36_01 = 
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]]

 [[15 16 17 18 19]
  [20 21 22 23 24]
  [25 26 27 28 29]]]

=====================

(3, 2, 5)
a36_01.transpose() = 
[[[ 0  1  2  3  4]
  [15 16 17 18 19]]

 [[ 5  6  7  8  9]
  [20 21 22 23 24]]

 [[10 11 12 13 14]
  [25 26 27 28 29]]]
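The element positions described for transpose(1, 0, 2) can be verified directly. In this sketch a and b are local names for the (2, 3, 5) array and its transposed view.

```python
import numpy as np

a = np.arange(30).reshape(2, 3, 5)
b = a.transpose(1, 0, 2)     # swap the first two axes

# An element at a[i, j, k] is found at b[j, i, k]
pos_11 = (a[0, 2, 1], b[2, 0, 1])
pos_28 = (a[1, 2, 3], b[2, 1, 3])
is_view = np.shares_memory(a, b)
```

The transposed array is a view: only the order of the axes in the metadata changes, and no element is moved in memory.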

ndarray.resize() modifies the array itself¶

a37=np.array([[ 2.,  8.,  0.,  6.],[ 4.,  5.,  1.,  1.],[ 8.,  9.,  3.,  6.]])
print(a37)
print('-'*100)
a37.resize((2,6))
print(a37)

[[2. 8. 0. 6.]
 [4. 5. 1. 1.]
 [8. 9. 3. 6.]]
----------------------------------------------------------------------------------------------------
[[2. 8. 0. 6. 4. 5.]
 [1. 1. 8. 9. 3. 6.]]
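The resize method should not be confused with the np.resize function, which behaves differently when the new size is larger than the original. The sketch below is a small illustration; refcheck=False is passed only so that the in-place call does not fail in environments that hold extra references to the array.

```python
import numpy as np

src = np.array([1, 2, 3])
repeated = np.resize(src, (2, 3))    # function: new array, input repeated to fill

own = np.array([1, 2, 3])
own.resize((2, 3), refcheck=False)   # method: in place, new cells are zero-filled
```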

Stacking NumPy Arrays¶

a38=np.arange(4).reshape(2,2)
print(a38)
print('-'*100)
a39=2*a38  # broadcasting
print(a39)

[[0 1]
 [2 3]]
----------------------------------------------------------------------------------------------------
[[0 2]
 [4 6]]

np.hstack((a,b))¶

np.hstack((a38,a39))

array([[0, 1, 0, 2],
       [2, 3, 4, 6]])

np.vstack((a,b))¶

np.vstack((a38,a39))

array([[0, 1],
       [2, 3],
       [0, 2],
       [4, 6]])

np.concatenate((a,b), axis=0/1)¶

print(np.concatenate((a38,a39), axis=1))
print('-'*100)
print(np.concatenate((a38,a39), axis=0))

[[0 1 0 2]
 [2 3 4 6]]
----------------------------------------------------------------------------------------------------
[[0 1]
 [2 3]
 [0 2]
 [4 6]]

np.dstack((a,b)) # depth-wise stacking¶

a38=np.arange(4).reshape(2,2)
print(a38)
print('-'*100)
a39=2*a38  # broadcasting
print(a39)
print('-'*100)
print(np.dstack((a38,a39)))
print('-'*100)
np.dstack((a38,a39)).shape

[[0 1]
 [2 3]]
----------------------------------------------------------------------------------------------------
[[0 2]
 [4 6]]
----------------------------------------------------------------------------------------------------
[[[0 0]
  [1 2]]

 [[2 4]
  [3 6]]]
----------------------------------------------------------------------------------------------------

(2, 2, 2)

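Compare a with np.dstack((a)):
the result is transposed and gains an extra pair of []. Note that (a39) is not a tuple, so dstack treats the rows of a39 as the arrays to stack.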

print(a39)
a39.shape

[[0 2]
 [4 6]]

(2, 2)

print(np.dstack((a39)))
np.dstack((a39)).shape

[[[0 4]
  [2 6]]]

(1, 2, 2)

np.column_stack( (a, b) )¶

a40=np.arange(2)
print(a40)
print('-'*100)
a41=2*a40
print(a41)

[0 1]
----------------------------------------------------------------------------------------------------
[0 2]

print(np.column_stack((a40, a41))) # note that the rows here are [0,0] and [1,2]
np.column_stack((a40, a41)).shape

[[0 0]
 [1 2]]

(2, 2)

For 2-D arrays, the following two functions give the same result.

np.column_stack((a38, a39))

array([[0, 1, 0, 2],
       [2, 3, 4, 6]])

np.hstack((a38, a39))

array([[0, 1, 0, 2],
       [2, 3, 4, 6]])

np.column_stack((a38, a39))==np.hstack((a38, a39))

array([[ True,  True,  True,  True],
       [ True,  True,  True,  True]])

np.column_stack((a38, a39)).shape

(2, 4)

[:, np.newaxis]¶

import numpy as np
from numpy import newaxis
np.column_stack((a38,a39))     # with 2D arrays

array([[0, 1, 0, 2],
       [2, 3, 4, 6]])

a38

array([[0, 1],
       [2, 3]])

a38.shape

(2, 2)

a38[:,newaxis]               # inserts a new axis; for the 2-D a38 the result has shape (2, 1, 2)

array([[[0, 1]],

       [[2, 3]]])

a38[:,newaxis].shape

(2, 1, 2)

np.column_stack((a38[:,newaxis],a39[:,newaxis]))

array([[[0, 1],
        [0, 2]],

       [[2, 3],
        [4, 6]]])

np.hstack((a38[:,newaxis],a39[:,newaxis]))

array([[[0, 1],
        [0, 2]],

       [[2, 3],
        [4, 6]]])

np.column_stack((a38[:,newaxis],a39[:,newaxis])).shape

(2, 2, 2)

Stacking after slicing¶

a38[0:]

array([[0, 1],
       [2, 3]])

a39[1]

array([4, 6])

np.column_stack((a38[0:],a39[1]))

array([[0, 1, 4],
       [2, 3, 6]])

np.row_stack( (a, b) )¶

print(np.row_stack((a40, a41)) )  # note that the rows here are [0,1] and [0,2]
np.row_stack((a40, a41)).shape

[[0 1]
 [0 2]]

(2, 2)

np.row_stack((a40, a41))

array([[0, 1],
       [0, 2]])

np.vstack((a40, a41))

array([[0, 1],
       [0, 2]])

np.row_stack((a40, a41))==np.vstack((a40, a41))

array([[ True,  True],
       [ True,  True]])

Splitting NumPy Arrays¶

import numpy as np
a42=np.arange(24).reshape(4,6)
a42

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

np.hsplit()¶

Specify the number of equally shaped arrays to return,
or specify the columns after which the division should occur.

print(np.hsplit(a42,3))
print('-'*100)
for i in np.hsplit(a42,3):
    print(i)
    print('+'*100)

[array([[ 0,  1],
       [ 6,  7],
       [12, 13],
       [18, 19]]), array([[ 2,  3],
       [ 8,  9],
       [14, 15],
       [20, 21]]), array([[ 4,  5],
       [10, 11],
       [16, 17],
       [22, 23]])]
----------------------------------------------------------------------------------------------------
[[ 0  1]
 [ 6  7]
 [12 13]
 [18 19]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[[ 2  3]
 [ 8  9]
 [14 15]
 [20 21]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[[ 4  5]
 [10 11]
 [16 17]
 [22 23]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

print(np.hsplit(a42,(3,5)))   # split a42 after the third and the fifth column
print('-'*100)
for i in np.hsplit(a42,(3,5)):
    print(i)
    print('+'*100)

[array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14],
       [18, 19, 20]]), array([[ 3,  4],
       [ 9, 10],
       [15, 16],
       [21, 22]]), array([[ 5],
       [11],
       [17],
       [23]])]
----------------------------------------------------------------------------------------------------
[[ 0  1  2]
 [ 6  7  8]
 [12 13 14]
 [18 19 20]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[[ 3  4]
 [ 9 10]
 [15 16]
 [21 22]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[[ 5]
 [11]
 [17]
 [23]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
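Splitting into a given number of parts requires that the axis length be divisible by that number. When it is not, np.array_split can be used instead; the sketch below, on an illustrative 7-element array, shows both behaviours.

```python
import numpy as np

row = np.arange(7)
parts = np.array_split(row, 3)       # uneven split is allowed: sizes 3, 2, 2
sizes = [len(p) for p in parts]

try:
    np.split(row, 3)                 # 7 is not divisible by 3
    split_failed = False
except ValueError:
    split_failed = True
```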

np.vsplit()¶

np.vsplit(a42,2)

[array([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]]), array([[12, 13, 14, 15, 16, 17],
        [18, 19, 20, 21, 22, 23]])]

np.split(axis=0/1)¶

np.split(a42,2, axis=0)

[array([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]]), array([[12, 13, 14, 15, 16, 17],
        [18, 19, 20, 21, 22, 23]])]

np.split(a42,2,axis=1)

[array([[ 0,  1,  2],
        [ 6,  7,  8],
        [12, 13, 14],
        [18, 19, 20]]), array([[ 3,  4,  5],
        [ 9, 10, 11],
        [15, 16, 17],
        [21, 22, 23]])]

np.dsplit()¶

a43=np.arange(64).reshape(2,4,8)
print(a43)
print('-'*100)
a44=np.dsplit(a43,2)
print(type(a44))
print(a44)
print('-'*100)
for i in a44:
    print(i)
    print('-'*100)

[[[ 0  1  2  3  4  5  6  7]
  [ 8  9 10 11 12 13 14 15]
  [16 17 18 19 20 21 22 23]
  [24 25 26 27 28 29 30 31]]

 [[32 33 34 35 36 37 38 39]
  [40 41 42 43 44 45 46 47]
  [48 49 50 51 52 53 54 55]
  [56 57 58 59 60 61 62 63]]]
----------------------------------------------------------------------------------------------------
<class 'list'>
[array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11],
        [16, 17, 18, 19],
        [24, 25, 26, 27]],

       [[32, 33, 34, 35],
        [40, 41, 42, 43],
        [48, 49, 50, 51],
        [56, 57, 58, 59]]]), array([[[ 4,  5,  6,  7],
        [12, 13, 14, 15],
        [20, 21, 22, 23],
        [28, 29, 30, 31]],

       [[36, 37, 38, 39],
        [44, 45, 46, 47],
        [52, 53, 54, 55],
        [60, 61, 62, 63]]])]
----------------------------------------------------------------------------------------------------
[[[ 0  1  2  3]
  [ 8  9 10 11]
  [16 17 18 19]
  [24 25 26 27]]

 [[32 33 34 35]
  [40 41 42 43]
  [48 49 50 51]
  [56 57 58 59]]]
----------------------------------------------------------------------------------------------------
[[[ 4  5  6  7]
  [12 13 14 15]
  [20 21 22 23]
  [28 29 30 31]]

 [[36 37 38 39]
  [44 45 46 47]
  [52 53 54 55]
  [60 61 62 63]]]
----------------------------------------------------------------------------------------------------

NumPy Copies and Views¶

import numpy as np
a45=np.arange(12)
a45

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Simple assignment (=) makes no copy of the array object or of its data¶

a46=a45 # no new object is created
a46

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

a46 is a45 # a46 and a45 are two names for the same ndarray object

True

a46.shape=3,4
a46.shape

(3, 4)

a45.shape

(3, 4)

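ndarray.view() creates a new array object¶

Different array objects can share the same data.
numpy.ndarray.view creates a new array object that looks at the same memory; when given a dtype, it can also reinterpret that memory as a different data type.
No extra copy of the data is made, which saves memory.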

a47=a45.view()
a47

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

a47 is a45

False

ndarray.base¶

a47.base is a45  # a47 is a view of the data owned by a45

True

ndarray.flags¶

a45.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

a47.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

a45

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

data changes¶

a47[0,0]=100  # a45's data changes
a47

array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11]])

a45

array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11]])

shape doesn't change¶

a47.shape=2,6  # a45's shape doesn't change
a47

array([[100,   1,   2,   3,   4,   5],
       [  6,   7,   8,   9,  10,  11]])

a45

array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11]])

Slicing an array returns a view¶

If reshape is then applied to a non-contiguous slice, the result is a copy rather than a view¶

a48=a45[:, 1:3].reshape(2,3)
a48

array([[ 1,  2,  5],
       [ 6,  9, 10]])

a48[:]=666
a48

array([[666, 666, 666],
       [666, 666, 666]])

a48.shape=3,2
a48

array([[666, 666],
       [666, 666],
       [666, 666]])

a45

array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11]])

Without reshape a view is returned, so the original data changes¶

a49=a45[:,2:4]
a49

array([[ 2,  3],
       [ 6,  7],
       [10, 11]])

a49[:]=666
a49

array([[666, 666],
       [666, 666],
       [666, 666]])

a45

array([[100,   1, 666, 666],
       [  4,   5, 666, 666],
       [  8,   9, 666, 666]])
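Whether an expression produced a view or a copy can be checked without modifying any data. The following sketch uses np.shares_memory on a fresh array (the names are for illustration only) and covers the two cases above plus index-array selection, which always copies.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

sliced = a[:, 1:3]                  # basic slice: a view
reshaped = a[:, 1:3].reshape(2, 3)  # non-contiguous slice reshaped: a copy
fancy = a[[0, 2]]                   # index-array selection: always a copy

results = {
    'sliced': np.shares_memory(a, sliced),
    'reshaped': np.shares_memory(a, reshaped),
    'fancy': np.shares_memory(a, fancy),
}
```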

ndarray.copy() deep copy¶

a45

array([[100,   1, 666, 666],
       [  4,   5, 666, 666],
       [  8,   9, 666, 666]])

a50=a45.copy() # a new array object with new data is created
a50

array([[100,   1, 666, 666],
       [  4,   5, 666, 666],
       [  8,   9, 666, 666]])

a50 is a45

False

a50.base is a45

False

a50[:]=0
a50

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

a45

array([[100,   1, 666, 666],
       [  4,   5, 666, 666],
       [  8,   9, 666, 666]])

Converting NumPy Arrays¶

ndarray.tolist() converts a NumPy array to a Python list¶

a45

array([[100,   1, 666, 666],
       [  4,   5, 666, 666],
       [  8,   9, 666, 666]])

a45.tolist()

[[100, 1, 666, 666], [4, 5, 666, 666], [8, 9, 666, 666]]

len(a45.tolist())

3

ndarray.astype() converts an array to a specified data type¶

print(a45.dtype)
a45_01=a45.astype(str)
a45_01

int64

array([['100', '1', '666', '666'],
       ['4', '5', '666', '666'],
       ['8', '9', '666', '666']], dtype='<U21')

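np.rollaxis(arr, axis, start) changes the position of an axis of an array¶

arr: the input array
axis: the axis to roll; the relative order of the other axes does not change
start: defaults to 0, which rolls the axis all the way to the front; otherwise the axis is rolled until it lies before this position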

print(a43)
print(a43.shape)

[[[ 0  1  2  3  4  5  6  7]
  [ 8  9 10 11 12 13 14 15]
  [16 17 18 19 20 21 22 23]
  [24 25 26 27 28 29 30 31]]

 [[32 33 34 35 36 37 38 39]
  [40 41 42 43 44 45 46 47]
  [48 49 50 51 52 53 54 55]
  [56 57 58 59 60 61 62 63]]]
(2, 4, 8)

a43_01 = np.rollaxis(a43, 0, 3) 
print(a43_01)
print(a43_01.shape)

[[[ 0 32]
  [ 1 33]
  [ 2 34]
  [ 3 35]
  [ 4 36]
  [ 5 37]
  [ 6 38]
  [ 7 39]]

 [[ 8 40]
  [ 9 41]
  [10 42]
  [11 43]
  [12 44]
  [13 45]
  [14 46]
  [15 47]]

 [[16 48]
  [17 49]
  [18 50]
  [19 51]
  [20 52]
  [21 53]
  [22 54]
  [23 55]]

 [[24 56]
  [25 57]
  [26 58]
  [27 59]
  [28 60]
  [29 61]
  [30 62]
  [31 63]]]
(4, 8, 2)
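The same rearrangement can be written with np.moveaxis, where the destination position of the axis is stated directly. The sketch below, on an array of the same shape, checks that both calls agree.

```python
import numpy as np

a = np.arange(64).reshape(2, 4, 8)
rolled = np.rollaxis(a, 0, 3)     # axis 0 rolled until it lies before position 3
moved = np.moveaxis(a, 0, -1)     # axis 0 moved to the last position
```

Both results have shape (4, 8, 2) and are views of the original data.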

NumPy Function and Method Reference¶

Array Creation¶

Conversions¶

Manipulations¶

Questions¶

Ordering¶

Operations¶

Basic Statistics¶

Basic Linear Algebra¶

Broadcasting¶

a51=np.arange(5)
a51

array([0, 1, 2, 3, 4])

a51 += 4
a51

array([4, 5, 6, 7, 8])

a52=np.array([[0],[1],[2],[3]])
a52

array([[0],
       [1],
       [2],
       [3]])

a52.ndim

2

a52.shape

(4, 1)

a53=np.array([1,2,3])
a53

array([1, 2, 3])

a53.shape

(3,)

a54=a52+a53
a54

array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]])

a54.shape

(4, 3)
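The result shape follows from comparing the two shapes from the right: dimensions are compatible when they are equal or when one of them is 1. A minimal sketch of this rule, with illustrative names, including a case that fails:

```python
import numpy as np

col = np.arange(4).reshape(4, 1)    # shape (4, 1)
row = np.array([1, 2, 3])           # shape (3,)

# Shapes are compared from the right: (4, 1) vs (3,) -> (4, 3)
result_shape = np.broadcast(col, row).shape
total = col + row

try:
    np.arange(4) + row              # (4,) vs (3,): neither is 1, so it fails
    compatible = True
except ValueError:
    compatible = False
```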

Indexing¶

a55 = np.arange(12)**2  # the first 12 square numbers
a55

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121])

a56 = np.array( [ 1,1,3,8,5 ] )  # an array of indices
a56

array([1, 1, 3, 8, 5])

a55[a56]  # the elements of a55 at the positions a56

array([ 1,  1,  9, 64, 25])

a57=np.array( [ [ 3, 4], [ 9, 7 ] ] )  # a bidimensional array of indices
a57

array([[3, 4],
       [9, 7]])

a55[a57]  # the same shape as a57

array([[ 9, 16],
       [81, 49]])

When the indexed array a is multidimensional, a single array of indices refers to the first dimension of a¶

a58 = np.array([
                [0,0,0],           # black
                [255,0,0],         # red
                [0,255,0],         # green
                [0,0,255],         # blue
                [255,255,255]      # white
              ])
a58

array([[  0,   0,   0],
       [255,   0,   0],
       [  0, 255,   0],
       [  0,   0, 255],
       [255, 255, 255]])

a59=np.array([[ 0, 1, 2, 0 ],# each value corresponds to a color in the a58
              [ 0, 3, 4, 0 ]])
a59

array([[0, 1, 2, 0],
       [0, 3, 4, 0]])

a58[a59]

array([[[  0,   0,   0],
        [255,   0,   0],
        [  0, 255,   0],
        [  0,   0,   0]],

       [[  0,   0,   0],
        [  0,   0, 255],
        [255, 255, 255],
        [  0,   0,   0]]])

a58.shape # the first dimension is the one with 5 elements

(5, 3)

a59.shape

(2, 4)

a58[a59].shape

(2, 4, 3)

Indexing in more than one dimension: the index arrays for each dimension must have the same shape¶

a60 = np.arange(12).reshape(3,4)
a60

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

# indices for the first dim of a
a61 = np.array( [ [0,1], [1,2] ] )
a61

array([[0, 1],
       [1, 2]])

# indices for the second dim
a62 = np.array( [ [2,1], [3,3] ] )
a62

array([[2, 1],
       [3, 3]])

a60[a61,a62]   # a61 and a62 must have equal shape
# coordinates are taken as (row, column)
#(0,2)=2,(1,1)=5
#(1,3)=7,(2,3)=11

array([[ 2,  5],
       [ 7, 11]])

a60[a61,2] # the scalar 2 is broadcast to the shape of a61

array([[ 2,  6],
       [ 6, 10]])

a60[:,a62] # applied to each of the three elements along the first axis of a60
# equivalent to stacking
#a60[0][a62]
#a60[1][a62]
#a60[2][a62]

array([[[ 2,  1],
        [ 3,  3]],

       [[ 6,  5],
        [ 7,  7]],

       [[10,  9],
        [11, 11]]])

Finding the maximum of time-dependent series¶

a63 = np.linspace(20, 145, 5)                 # time scale
a63

array([ 20.  ,  51.25,  82.5 , 113.75, 145.  ])

a64 = np.sin(np.arange(20)).reshape(5,4)
# 4 time-dependent series
a64

array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ,  0.6569866 ],
       [ 0.98935825,  0.41211849, -0.54402111, -0.99999021],
       [-0.53657292,  0.42016704,  0.99060736,  0.65028784],
       [-0.28790332, -0.96139749, -0.75098725,  0.14987721]])

a65=a64.argmax(axis=0) # index of the maxima for each series
a65

array([2, 0, 3, 1])

a66 = a63[a65]   # times corresponding to the maxima
a66

array([ 82.5 ,  20.  , 113.75,  51.25])

a67=a64[a65, range(a64.shape[1])] # => a64[a65[0],0], a64[a65[1],1], ...
a67

array([0.98935825, 0.84147098, 0.99060736, 0.6569866 ])
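As a sanity check, the values picked with the argmax indices should equal the column maxima computed directly. The sketch below compares three ways of getting them; `data` and `ind` are illustrative names for a64 and a65, and `np.take_along_axis` is assumed to be available (NumPy 1.15 or later).

```python
import numpy as np

data = np.sin(np.arange(20)).reshape(5, 4)   # 4 time-dependent series
ind = data.argmax(axis=0)                    # row index of the maximum of each series

by_fancy_index = data[ind, np.arange(data.shape[1])]
by_max = data.max(axis=0)
by_take = np.take_along_axis(data, ind[np.newaxis, :], axis=0)[0]

print(np.allclose(by_fancy_index, by_max))
print(np.allclose(by_take, by_max))
```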

a64.shape

(5, 4)

a64.shape[1]

4

range(a64.shape[1])

range(0, 4)

for i in range(0, 4):
    print(i)
print(type(range(0, 4)))

0
1
2
3
<class 'range'>

Indexing with Boolean arrays¶

a65 = np.arange(12).reshape(3,4)
np.random.shuffle(a65)
a65

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3]])

a66=(a65>4)&(a65<9)  # a66 is a boolean array with the same shape as a65
# a66=(a65>4)|(a65<9)  # | is the element-wise "or"
# this is how several conditions are combined, for example for np.where
a66

array([[False,  True,  True,  True],
       [ True, False, False, False],
       [False, False, False, False]])

a65[a66] # 1d array with the selected elements

array([5, 6, 7, 8])

This property is very useful in assignments (conditional assignment)¶

a65[a66]=666
a65

array([[  4, 666, 666, 666],
       [666,   9,  10,  11],
       [  0,   1,   2,   3]])
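Assignment through a Boolean mask changes the array in place. When the original values must be kept, the three-argument form of np.where builds a new array instead. A minimal sketch, using an unshuffled array so that the result is reproducible:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = (a > 4) & (a < 9)          # True for 5, 6, 7, 8

# np.where(condition, x, y) takes x where the condition holds and y elsewhere
replaced = np.where(mask, 666, a)
print(replaced)
print(a[1, 1])                    # the original array is unchanged
```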

np.ix_() combines different vectors so that a result is computed for every n-uplet¶

a67= np.array([2,3,4,5])
a68= np.array([8,5,4])
a69= np.array([5,4,6,8,3])
print(a67)
print(a68)
print(a69)
print('-'*100)
print(a67.shape)
print(a68.shape)
print(a69.shape)

[2 3 4 5]
[8 5 4]
[5 4 6 8 3]
----------------------------------------------------------------------------------------------------
(4,)
(3,)
(5,)

a671,a681,a691 = np.ix_(a67,a68,a69)
print(a671)
print('-'*100)
print(a681)
print('-'*100)
print(a691)
print('+'*100)
print(a671.shape)
print(a681.shape)
print(a691.shape)

[[[2]]

 [[3]]

 [[4]]

 [[5]]]
----------------------------------------------------------------------------------------------------
[[[8]
  [5]
  [4]]]
----------------------------------------------------------------------------------------------------
[[[5 4 6 8 3]]]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(4, 1, 1)
(1, 3, 1)
(1, 1, 5)

a70=a671+a681*a691
print(a70)
print(a70.shape)

[[[42 34 50 66 26]
  [27 22 32 42 17]
  [22 18 26 34 14]]

 [[43 35 51 67 27]
  [28 23 33 43 18]
  [23 19 27 35 15]]

 [[44 36 52 68 28]
  [29 24 34 44 19]
  [24 20 28 36 16]]

 [[45 37 53 69 29]
  [30 25 35 45 20]
  [25 21 29 37 17]]]
(4, 3, 5)

a70[3,2,4]

17

a67[3]+a68[2]*a69[4]

17
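np.ix_ only reshapes each vector so that they broadcast against each other. The sketch below reproduces the same computation with explicit np.newaxis indexing; the short names `a`, `b`, and `c` stand in for a67, a68, and a69.

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

ax, bx, cx = np.ix_(a, b, c)       # shapes (4,1,1), (1,3,1), (1,1,5)
via_ix = ax + bx * cx

# the same reshaping written out by hand
via_newaxis = (a[:, np.newaxis, np.newaxis]
               + b[np.newaxis, :, np.newaxis] * c[np.newaxis, np.newaxis, :])

print(np.array_equal(via_ix, via_newaxis))
print(via_ix[3, 2, 4] == a[3] + b[2] * c[4])
```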

A reduce example¶

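def ufunc_reduce(ufct, *vectors):
    vs = np.ix_(*vectors)   # reshape the vectors so that they broadcast against each other
    r = ufct.identity       # neutral element of the ufunc (0 for np.add)
    for v in vs:
        r = ufct(r, v)
    return r
ufunc_reduce(np.add,a67,a68,a69)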

array([[[15, 14, 16, 18, 13],
        [12, 11, 13, 15, 10],
        [11, 10, 12, 14,  9]],

       [[16, 15, 17, 19, 14],
        [13, 12, 14, 16, 11],
        [12, 11, 13, 15, 10]],

       [[17, 16, 18, 20, 15],
        [14, 13, 15, 17, 12],
        [13, 12, 14, 16, 11]],

       [[18, 17, 19, 21, 16],
        [15, 14, 16, 18, 13],
        [14, 13, 15, 17, 12]]])

np.add.identity

0
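The explicit loop can also be written with functools.reduce from the standard library. This is only an alternative sketch; `ufunc_reduce_ft` is a hypothetical name, and unlike the loop above it does not start from the ufunc's identity, so it needs at least one input vector.

```python
from functools import reduce

import numpy as np

def ufunc_reduce_ft(ufct, *vectors):
    """Apply a binary ufunc across all combinations of the input vectors."""
    return reduce(ufct, np.ix_(*vectors))

out = ufunc_reduce_ft(np.add,
                      np.array([2, 3, 4, 5]),
                      np.array([8, 5, 4]),
                      np.array([5, 4, 6, 8, 3]))
print(out.shape)
print(out[0, 0, 0], out[3, 2, 4])
```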

NumPy application examples¶

File I/O: np.savetxt("filename.txt", array)¶

Identity matrix: np.eye()¶

import numpy as np
i2=np.eye(2)
i2

array([[1., 0.],
       [0., 1.]])

np.savetxt("eye.txt",i2)
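To see what np.savetxt actually writes, the sketch below saves the identity matrix to a temporary file, prints the text, and loads it back with np.loadtxt. The temporary directory is an assumption made here so that the example leaves no file behind.

```python
import os
import tempfile

import numpy as np

i2 = np.eye(2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "eye.txt")
    np.savetxt(path, i2)               # one text line per matrix row
    restored = np.loadtxt(path)        # parse the text back into an array
    with open(path) as fh:
        text = fh.read()

print(text)
print(np.array_equal(i2, restored))
```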

CSV files (Comma-Separated Values)¶

The loadtxt() function¶

c,v=np.loadtxt('data.csv', delimiter=',', usecols=(6,7), unpack=True)
print(c)
print('-'*100)
print(v)

[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
----------------------------------------------------------------------------------------------------
[21144800. 13473000. 15236800.  9242600. 14064100. 11494200. 17322100.
 13608500. 17240800. 33162400. 13127500. 11086200. 10149000. 17184100.
 18949000. 29144500. 31162200. 23994700. 17853500. 13572000. 14395400.
 16290300. 21521000. 17885200. 16188000. 19504300. 12718000. 16192700.
 18138800. 16824200.]

Volume-weighted average price (VWAP): $np.average(c, weights=v)$¶

Arithmetic mean: $np.mean(c)$¶

import numpy as np
c,v=np.loadtxt('data.csv',delimiter=',',usecols=(6,7),unpack=True)  # c: closing prices, v: volumes
vwap=np.average(c,weights=v)
print(c)
print('-'*100)
print('VWAP=',vwap)
print('mean=', np.mean(c))

[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
----------------------------------------------------------------------------------------------------
VWAP= 350.5895493532009
mean= 351.0376666666667
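A weighted average is the sum of value times weight divided by the sum of the weights. The sketch below checks np.average against that definition on a tiny made-up data set (the prices and volumes are hypothetical, not taken from data.csv).

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])    # hypothetical closing prices
volumes = np.array([1.0, 1.0, 2.0])      # hypothetical traded volumes

vwap = np.average(prices, weights=volumes)
manual = np.sum(prices * volumes) / np.sum(volumes)

print(vwap, manual)        # the heavily traded price of 30 pulls the average up
print(np.mean(prices))     # the plain mean ignores the volumes
```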

Time-weighted average price $TWAP\ (Time-Weighted\ Average\ Price)$¶

t=np.arange(len(c))
print(t)
print(len(c))
print(c)
print('-'*100)
print('TWAP=', np.average(c, weights=t))

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29]
30
[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
----------------------------------------------------------------------------------------------------
TWAP= 352.4283218390804

Maximum: $np.max()$¶

Minimum: $np.min()$¶

Range (peak to peak): $np.ptp()$¶

h,l=np.loadtxt('data.csv', delimiter=',', usecols=(4,5), unpack=True)
print('highest=', np.max(h))
print('lowest=', np.min(l))
hl=np.array([np.max(h),np.min(l)])
print(hl)
print('mean=', np.mean(hl))

highest= 364.9
lowest= 333.53
[364.9  333.53]
mean= 349.215

print('spread high price', np.ptp(h)) # difference between the largest and smallest element of the array
print('spread low price', np.ptp(l))

spread high price 24.859999999999957
spread low price 26.970000000000027
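np.ptp ("peak to peak") is shorthand for maximum minus minimum. A minimal check on three high prices copied from the daily listing in this notebook:

```python
import numpy as np

high = np.array([344.4, 340.04, 345.65])

spread = np.ptp(high)
manual = high.max() - high.min()
print(spread, manual)
```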

Median: $np.median(c)$¶

print('median=', np.median(c))

median= 352.055

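Sorting: $np.msort(c)$¶

sorted_close = np.sort(c)  # np.msort(c) is np.sort(c, axis=0); msort was deprecated in NumPy 1.24 and removed in NumPy 2.0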
print('sorted=', sorted_close)
sorted_close[15]

sorted= [336.1  338.61 339.32 342.62 342.88 343.44 344.32 345.03 346.5  346.67
 348.16 349.31 350.56 351.88 351.99 352.12 352.47 353.21 354.54 355.2
 355.36 355.76 356.85 358.16 358.3  359.18 359.56 359.9  360.   363.13]

352.12

N=len(c)
print(N)
n=int(N/2)
n1=int((N-1)/2)
print(n)
print(n1)
print('middle=', sorted_close[n])
print('middle=', sorted_close[n1])
n2=np.array([sorted_close[n], sorted_close[n1]])
print('median=', np.mean(n2))

30
15
14
middle= 352.12
middle= 351.99
median= 352.055
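The manual calculation above handles an even number of values. A small sketch that covers both the even and the odd case; `median_by_sorting` is a hypothetical helper written for this illustration, compared against np.median.

```python
import numpy as np

def median_by_sorting(values):
    """Median of a 1-D sequence: middle element, or mean of the two middle ones."""
    s = np.sort(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                    # odd count: single middle element
    return (s[mid - 1] + s[mid]) / 2     # even count: average the two middle elements

odd = np.array([3.0, 1.0, 2.0])
even = np.array([4.0, 1.0, 3.0, 2.0])
print(median_by_sorting(odd), np.median(odd))
print(median_by_sorting(even), np.median(even))
```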

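Variance: $np.var(c)$¶

The variance is the sum of the squared deviations of each value from the arithmetic mean of all values, divided by the number of values.
The square root of the variance is the standard deviation (also called the root mean square deviation).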

c

array([336.1 , 339.32, 345.03, 344.32, 343.44, 346.5 , 351.88, 355.2 ,
       358.16, 354.54, 356.85, 359.18, 359.9 , 363.13, 358.3 , 350.56,
       338.61, 342.62, 342.88, 348.16, 353.21, 349.31, 352.12, 359.56,
       360.  , 355.36, 355.76, 352.47, 346.67, 351.99])

print('variance=', np.var(c))

variance= 50.126517888888884

np.std(c) # standard deviation

7.080008325481608

np.std(c)*np.std(c) # check: the square of the standard deviation equals the variance

50.126517888888884

print("variance from definition =", np.mean((c - c.mean())**2))

variance from definition = 50.126517888888884

print("variance from definition =", np.mean((c - np.mean(c))**2))

variance from definition = 50.126517888888884

print(c.mean())
print(np.mean(c))

351.0376666666667
351.0376666666667
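By default np.var and np.std divide by the number of values N (the population form, as in the definition above). Passing ddof=1 divides by N-1 instead, which gives the sample form. A sketch on a small made-up array:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])   # hypothetical data, mean 5

pop_var = np.var(x)              # sum of squared deviations / N
sample_var = np.var(x, ddof=1)   # sum of squared deviations / (N - 1)
pop_std = np.std(x)

print(pop_var, pop_std)
print(sample_var)
```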

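Stock returns¶

The simple return is the rate of change between two consecutive prices.
The log return is the difference between the logarithms of two consecutive prices.
- The logarithm of $a$ minus the logarithm of $b$ equals the logarithm of $a$ divided by $b$
- so the log return can also be used to measure the rate of change of a price
A return is a ratio, so it has no unit.
Investors are most interested in the variance or standard deviation of the returns, because it indicates the size of the investment risk.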

$np.diff()$¶

returns=np.diff(c)/c[:-1]  # np.diff returns an array of the differences between adjacent elements
# simple return = gain / price at the start of the period
print(len(np.diff(c)), 'ok', np.diff(c))
print('-'*100)
print(len(c), 'ok', c)
print('-'*100)
print(len(c[:-1]), 'ok', c[:-1])
print('-'*100)
print(returns)
print('-'*100)
print("Standard deviation =", np.std(returns))

29 ok [  3.22   5.71  -0.71  -0.88   3.06   5.38   3.32   2.96  -3.62   2.31
   2.33   0.72   3.23  -4.83  -7.74 -11.95   4.01   0.26   5.28   5.05
  -3.9    2.81   7.44   0.44  -4.64   0.4   -3.29  -5.8    5.32]
----------------------------------------------------------------------------------------------------
30 ok [336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
----------------------------------------------------------------------------------------------------
29 ok [336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67]
----------------------------------------------------------------------------------------------------
[ 0.00958048  0.01682777 -0.00205779 -0.00255576  0.00890985  0.0155267
  0.00943503  0.00833333 -0.01010721  0.00651548  0.00652935  0.00200457
  0.00897472 -0.01330102 -0.02160201 -0.03408832  0.01184253  0.00075886
  0.01539897  0.01450483 -0.01104159  0.00804443  0.02112916  0.00122372
 -0.01288889  0.00112562 -0.00924781 -0.0164553   0.01534601]
----------------------------------------------------------------------------------------------------
Standard deviation = 0.012922134436826306

$np.log(c)$¶

logreturns = np.diff( np.log(c) )
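# the logarithm is the inverse operation of exponentiation
# if a to the power x equals N (a>0 and a != 1), then x is the logarithm of N to base a, written x = log_a(N)
# a is called the base of the logarithm and N its argument; np.log is the natural logarithm (base e)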
print(c)
print('-'*100)
print(len(np.log(c)), np.log(c))
print('-'*100)
print(len(logreturns))
logreturns

[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
----------------------------------------------------------------------------------------------------
30 [5.81740873 5.82694361 5.84363137 5.84157146 5.83901242 5.84788282
 5.86329021 5.87268101 5.88097981 5.87082117 5.87731553 5.88382366
 5.88582622 5.8947609  5.88137062 5.85953188 5.824849   5.83662196
 5.83738053 5.85266214 5.86706278 5.85595978 5.86397203 5.88488106
 5.88610403 5.87313136 5.87425635 5.86496551 5.84837332 5.86360277]
----------------------------------------------------------------------------------------------------
29

array([ 0.00953488,  0.01668775, -0.00205991, -0.00255903,  0.00887039,
        0.01540739,  0.0093908 ,  0.0082988 , -0.01015864,  0.00649435,
        0.00650813,  0.00200256,  0.00893468, -0.01339027, -0.02183875,
       -0.03468287,  0.01177296,  0.00075857,  0.01528161,  0.01440064,
       -0.011103  ,  0.00801225,  0.02090904,  0.00122297, -0.01297267,
        0.00112499, -0.00929083, -0.01659219,  0.01522945])
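Simple and log returns are tied together by log(1 + simple return) = log return, and log returns add up over time to the log of the total price ratio. The sketch below checks both identities on the first four closing prices shown above.

```python
import numpy as np

c = np.array([336.1, 339.32, 345.03, 344.32])   # first closing prices from the output above

simple = np.diff(c) / c[:-1]
log_r = np.diff(np.log(c))

# log(1 + r) converts a simple return into a log return
print(np.allclose(log_r, np.log1p(simple)))

# the sum of the log returns is the log of last price / first price
print(np.isclose(log_r.sum(), np.log(c[-1] / c[0])))
```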

print(2.71828**5.81740873) # e=2.71828

336.098683220974

np.log10(100)

2.0

np.log(np.e)

1.0

np.log2(8)

3.0

$np.where()$¶

Called with only a condition, it returns the indices of all array elements that satisfy that condition

posretindices = np.where(returns > 0)
print("Indices with positive returns", posretindices)

Indices with positive returns (array([ 0,  1,  4,  5,  6,  7,  9, 10, 11, 12, 16, 17, 18, 19, 21, 22, 23,
       25, 28]),)
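np.where with a single argument returns a tuple holding one index array per dimension, which is why the output above ends in `,)`. The sketch below shows how to get a plain index array and how to select the values directly; the `returns` values here are made up.

```python
import numpy as np

returns = np.array([0.01, -0.02, 0.03, 0.0, 0.005])   # hypothetical returns

idx_tuple = np.where(returns > 0)       # tuple with one array for the single dimension
idx = np.flatnonzero(returns > 0)       # plain 1-D index array
positive = returns[returns > 0]         # Boolean indexing returns the values themselves

print(idx_tuple)
print(idx)
print(positive)
```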

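Computing volatility with the $std$ and $mean$ functions¶

In finance, volatility is a measure of price variation.
Historical volatility is computed from historical price data.
Computing historical volatility (for example annual or monthly volatility) requires log returns.
Let the daily price series be $p_1, p_2, \ldots, p_n$.
Log return: $R_i = \log\left(\dfrac{p_i}{p_{i-1}}\right)=\log(p_i)-\log(p_{i-1})$
Standard deviation of the log returns: $\sigma = \sqrt{\dfrac{1}{N-1} \sum_{i=2}^{n}{(R_i-\bar{R})^2}}$
- where $\bar{R}=\dfrac{1}{N} \sum_{i=2}^{n} R_i$ and $N = n-1$ is the number of log returns
Annual volatility $=$ $\dfrac{\sigma}{\bar{R}} \times \sqrt{252}$, where $252$ is the usual number of trading days per year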

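The volatility index ($Volatility\ Index,\ VIX$) is obtained from the volatility formula;

if the calculation period is one day, $annualized\ VIX = VIX \times \sqrt{250}$

if the calculation period is one week, $annualized\ VIX = VIX \times \sqrt{52}$

if the calculation period is one month, $annualized\ VIX = VIX \times \sqrt{12}$

if the calculation period is one quarter, $annualized\ VIX = VIX \times 2$

if the calculation period is one year, $annualized\ VIX = VIX \times 1$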

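The standard deviation of the simple returns is obtained by dividing each difference between consecutive prices by the earlier price and taking the standard deviation of the result.
Annual volatility = (standard deviation of the log returns / their mean) / (square root of the reciprocal of the number of trading days)
Why divide the standard deviation of the log returns by their mean?
Coefficient of variation ($Coefficient\ of\ Variation$)
- When comparing the dispersion of two data sets whose measurement scales or units differ greatly, comparing the standard deviations directly is not appropriate.
- The influence of scale and units should be removed first. The coefficient of variation does this: it is the ratio of the standard deviation of the data to their mean.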

annual_volatility = np.std(logreturns)/np.mean(logreturns)  # standard deviation of the log returns divided by their mean
print(logreturns)
print('-'*100)
print(np.std(logreturns))
print(np.mean(logreturns))
print('-'*100)
print(annual_volatility)
annual_volatility = annual_volatility / np.sqrt(1./252.) # note the division inside sqrt:
# in Python 2, dividing two integers (1/252) truncates to 0,
# so floats are used here; in Python 3, / is always true division.
print(annual_volatility)

[ 0.00953488  0.01668775 -0.00205991 -0.00255903  0.00887039  0.01540739
  0.0093908   0.0082988  -0.01015864  0.00649435  0.00650813  0.00200256
  0.00893468 -0.01339027 -0.02183875 -0.03468287  0.01177296  0.00075857
  0.01528161  0.01440064 -0.011103    0.00801225  0.02090904  0.00122297
 -0.01297267  0.00112499 -0.00929083 -0.01659219  0.01522945]
----------------------------------------------------------------------------------------------------
0.012971835641060714
0.0015928976335373003
----------------------------------------------------------------------------------------------------
8.14354630702448
129.27478991115132
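Two details of this calculation are worth seeing side by side. First, np.std divides by N by default, while the formula above divides by N-1 (ddof=1). Second, the ratio used here divides by the mean return, whereas many references annualize the standard deviation on its own as sigma times sqrt(252). The sketch below computes both on the first six closing prices shown earlier; it is an illustration of the difference, not a recommendation of either convention.

```python
import numpy as np

c = np.array([336.1, 339.32, 345.03, 344.32, 343.44, 346.5])   # first closing prices
logreturns = np.diff(np.log(c))

sigma_pop = np.std(logreturns)              # divides by N (NumPy default)
sigma_sample = np.std(logreturns, ddof=1)   # divides by N - 1, as in the formula above

# the ratio used in this notebook: coefficient of variation scaled by sqrt(252)
ratio_based = sigma_pop / np.mean(logreturns) * np.sqrt(252)

# a common alternative convention: annualize the standard deviation alone
conventional = sigma_sample * np.sqrt(252)

print(sigma_pop, sigma_sample)
print(ratio_based, conventional)
```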

print(logreturns)
print(np.std(logreturns))
print(np.mean(logreturns))

[ 0.00953488  0.01668775 -0.00205991 -0.00255903  0.00887039  0.01540739
  0.0093908   0.0082988  -0.01015864  0.00649435  0.00650813  0.00200256
  0.00893468 -0.01339027 -0.02183875 -0.03468287  0.01177296  0.00075857
  0.01528161  0.01440064 -0.011103    0.00801225  0.02090904  0.00122297
 -0.01297267  0.00112499 -0.00929083 -0.01659219  0.01522945]
0.012971835641060714
0.0015928976335373003

$np.sqrt(12)$ square root¶

np.sqrt(12)

3.4641016151377544

Date analysis¶

from datetime import datetime

def datestr2num(s):
    # depending on the NumPy version and the encoding argument, the converter receives bytes or str
    if isinstance(s, bytes):
        s = s.decode('ascii')
    # weekday(): Monday is 0 ... Sunday is 6
    return datetime.strptime(s, "%d-%m-%Y").date().weekday()

dates, close=np.loadtxt('data.csv', delimiter=',', usecols=(1,6), converters={1:datestr2num}, unpack=True)
print("Dates =",dates)
print('-'*100)
print("Close =",close)

Dates = [4. 0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 1. 2. 3. 4. 0. 1. 2. 3.
 4. 0. 1. 2. 3. 4.]
----------------------------------------------------------------------------------------------------
Close = [336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]
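The converter maps a date string in day-month-year form to a weekday number. A standalone sketch of that mapping; `weekday_of` is a hypothetical helper and the two dates are illustrative inputs, not values read from data.csv.

```python
from datetime import datetime

def weekday_of(text):
    """Return the weekday number of a 'dd-mm-YYYY' string (Monday=0 ... Sunday=6)."""
    return datetime.strptime(text, "%d-%m-%Y").date().weekday()

friday = weekday_of("28-01-2011")
monday = weekday_of("31-01-2011")
print(friday, monday)
```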

averages = np.zeros(5)
averages

array([0., 0., 0., 0., 0.])

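$np.where(dates == i)$ returns the indices of all array elements that satisfy the given condition¶

$np.argmax(averages)$ returns the index of the largest element of the $averages$ array¶

$np.argmin(averages)$ returns the index of the smallest element of the $averages$ array¶

$np.take(close, indices)$ takes the elements at the given indices from an array¶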

for i in range(5):
    indices = np.where(dates == i)
    prices = np.take(close, indices)
    avg = np.mean(prices)
    print("Day", i, "prices", prices, "Average", avg)
    averages[i] = avg
top = np.max(averages)
print("Highest average", top, '|',"Top day of the week", np.argmax(averages))
bottom = np.min(averages)
print("Lowest average", bottom, '|',"Bottom day of the week", np.argmin(averages))

Day 0 prices [[339.32 351.88 359.18 353.21 355.36]] Average 351.7900000000001
Day 1 prices [[345.03 355.2  359.9  338.61 349.31 355.76]] Average 350.63500000000005
Day 2 prices [[344.32 358.16 363.13 342.62 352.12 352.47]] Average 352.1366666666666
Day 3 prices [[343.44 354.54 358.3  342.88 359.56 346.67]] Average 350.8983333333333
Day 4 prices [[336.1  346.5  356.85 350.56 348.16 360.   351.99]] Average 350.0228571428571
Highest average 352.1366666666666 | Top day of the week 2
Lowest average 350.0228571428571 | Bottom day of the week 4
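The per-weekday averages can also be computed without np.where and np.take, either with Boolean masks or with np.bincount. A sketch on the first seven rows of the data shown above:

```python
import numpy as np

dates = np.array([4., 0., 1., 2., 3., 4., 0.])     # weekday numbers
close = np.array([336.1, 339.32, 345.03, 344.32, 343.44, 346.5, 351.88])

# Boolean mask per weekday
averages = np.array([close[dates == d].mean() for d in range(5)])

# np.bincount: sum of prices per weekday divided by the number of prices per weekday
days = dates.astype(int)
sums = np.bincount(days, weights=close, minlength=5)
counts = np.bincount(days, minlength=5)
by_bincount = sums / counts

print(averages)
print(np.allclose(averages, by_bincount))
print("Top day of the week", np.argmax(averages))
```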

Weekly summary¶

dates, open, high, low, close=np.loadtxt('data.csv', delimiter=',', usecols=(1, 3, 4,5, 6), converters={1: datestr2num}, unpack=True)
close = close[:16]
dates = dates[:16]
for i in range(len(close)):
    print(i,' | ',dates[i],' | ',open[i], ' | ', high[i], ' | ', low[i], ' | ', close[i])

0  |  4.0  |  344.17  |  344.4  |  333.53  |  336.1
1  |  0.0  |  335.8  |  340.04  |  334.3  |  339.32
2  |  1.0  |  341.3  |  345.65  |  340.98  |  345.03
3  |  2.0  |  344.45  |  345.25  |  343.55  |  344.32
4  |  3.0  |  343.8  |  344.24  |  338.55  |  343.44
5  |  4.0  |  343.61  |  346.7  |  343.51  |  346.5
6  |  0.0  |  347.89  |  353.25  |  347.64  |  351.88
7  |  1.0  |  353.68  |  355.52  |  352.15  |  355.2
8  |  2.0  |  355.19  |  359.0  |  354.87  |  358.16
9  |  3.0  |  357.39  |  360.0  |  348.0  |  354.54
10  |  4.0  |  354.75  |  357.8  |  353.54  |  356.85
11  |  0.0  |  356.79  |  359.48  |  356.71  |  359.18
12  |  1.0  |  359.19  |  359.97  |  357.55  |  359.9
13  |  2.0  |  360.8  |  364.9  |  360.5  |  363.13
14  |  3.0  |  357.1  |  360.27  |  356.52  |  358.3
15  |  4.0  |  358.21  |  359.5  |  349.52  |  350.56

first_monday = np.where(dates == 0)
print(type(first_monday), first_monday)
print(type(first_monday[0]),first_monday[0])
print(type(first_monday[0][0]),first_monday[0][0])

<class 'tuple'> (array([ 1,  6, 11]),)
<class 'numpy.ndarray'> [ 1  6 11]
<class 'numpy.int64'> 1

first_monday = np.ravel(np.where(dates == 0))[0]
print(type(np.where(dates == 0)), np.where(dates == 0))
print(type(np.ravel(np.where(dates == 0))),np.ravel(np.where(dates == 0)))
print(type(np.ravel(np.where(dates == 0))[0]),np.ravel(np.where(dates == 0))[0])
print(type(first_monday),"The first Monday index is", first_monday)

<class 'tuple'> (array([ 1,  6, 11]),)
<class 'numpy.ndarray'> [ 1  6 11]
<class 'numpy.int64'> 1
<class 'numpy.int64'> The first Monday index is 1

last_friday = np.ravel(np.where(dates == 4))[-1]
print("The last Friday index is", last_friday)

The last Friday index is 15

weeks_indices = np.arange(first_monday, last_friday + 1)
print("Weeks indices initial", weeks_indices)

Weeks indices initial [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]

weeks_indices = np.split(weeks_indices, 3)
print("Weeks indices after split \n", weeks_indices)
print(weeks_indices[0])
print(weeks_indices[1])
print(weeks_indices[2])
print(weeks_indices[-1])

Weeks indices after split 
 [array([1, 2, 3, 4, 5]), array([ 6,  7,  8,  9, 10]), array([11, 12, 13, 14, 15])]
[1 2 3 4 5]
[ 6  7  8  9 10]
[11 12 13 14 15]
[11 12 13 14 15]
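np.split with an integer only works when the array divides evenly, which holds here because 15 indices form exactly 3 weeks of 5 days. The sketch below shows what happens otherwise and how np.array_split handles the uneven case.

```python
import numpy as np

weeks = np.split(np.arange(1, 16), 3)          # 15 elements -> 3 equal parts
print([w.tolist() for w in weeks])

# 14 elements cannot be divided into 3 equal parts
try:
    np.split(np.arange(1, 15), 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)

# np.array_split allows parts of unequal length
uneven = np.array_split(np.arange(1, 15), 3)
print([len(part) for part in uneven])
```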

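$np.apply\_along\_axis$¶

This powerful function applies a given function to the 1-D slices along the specified axis.
What is it really useful for?
- Besides finer-grained, customized processing of rows and columns,
- it gives functions that have no axis parameter (such as np.bincount()) the ability to work row by row or column by column,
- without writing an explicit loop over the rows or columns.

weeks_indices now has 3 elements, one for each of the 3 weeks in the sample data;
each index value inside an element corresponds to one day of the sample data.
The call to apply_along_axis takes the name of our own function summarize,
the number of the axis to work along (here 1), the target array, and a variable number of extra arguments for summarize.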

def summarize(a, o, h, l, c):
    monday_open = o[a[0]]                # a is one row of weeks_indices; a[0] is the index of Monday, so for the first week this is o[1]
    week_high = np.max( np.take(h, a) ) 
    week_low = np.min( np.take(l, a) )
    friday_close = c[a[-1]]              # a[-1] is the index of Friday; for the last week a[-1]=15, so this is c[15]
    return("APPL", monday_open, week_high, week_low, friday_close)
weeksummary = np.apply_along_axis(summarize, 1, weeks_indices,open, high, low, close) # 1 means: apply the function to each row
print("Week summary\n", weeksummary)
print('-'*100)
for i in range(len(close)):
    print(i,' | ', weeks_indices[(i-1)//5][(i-1)%5],' | ', open[i],' | ', high[i],' | ', low[i],' | ', close[i])
    # for i = 0, weeks_indices[(i-1)//5][(i-1)%5] is 15: (i-1)//5 = -1 selects the last week (index 2), and (i-1)%5 = 4 selects the last element of that week, 15

Week summary
 [['APPL' '335.8' '346.7' '334.3' '346.5']
 ['APPL' '347.8' '360.0' '347.6' '356.8']
 ['APPL' '356.7' '364.9' '349.5' '350.5']]
----------------------------------------------------------------------------------------------------
0  |  15  |  344.17  |  344.4  |  333.53  |  336.1
1  |  1  |  335.8  |  340.04  |  334.3  |  339.32
2  |  2  |  341.3  |  345.65  |  340.98  |  345.03
3  |  3  |  344.45  |  345.25  |  343.55  |  344.32
4  |  4  |  343.8  |  344.24  |  338.55  |  343.44
5  |  5  |  343.61  |  346.7  |  343.51  |  346.5
6  |  6  |  347.89  |  353.25  |  347.64  |  351.88
7  |  7  |  353.68  |  355.52  |  352.15  |  355.2
8  |  8  |  355.19  |  359.0  |  354.87  |  358.16
9  |  9  |  357.39  |  360.0  |  348.0  |  354.54
10  |  10  |  354.75  |  357.8  |  353.54  |  356.85
11  |  11  |  356.79  |  359.48  |  356.71  |  359.18
12  |  12  |  359.19  |  359.97  |  357.55  |  359.9
13  |  13  |  360.8  |  364.9  |  360.5  |  363.13
14  |  14  |  357.1  |  360.27  |  356.52  |  358.3
15  |  15  |  358.21  |  359.5  |  349.52  |  350.56

$np.apply\_along\_axis$: a closer look¶

b = np.array([[1,2,3], [4,5,6], [7,8,9]])

b.shape

(3, 3)

b

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

np.apply_along_axis(np.diff,0,b) # differences along axis 0 (down each column)

array([[3, 3, 3],
       [3, 3, 3]])

np.apply_along_axis(np.diff,1,b) # differences along axis 1 (across each row)

array([[1, 1],
       [1, 1],
       [1, 1]])

b = np.array([[8,1,7], [4,3,9], [5,2,6]])

b

array([[8, 1, 7],
       [4, 3, 9],
       [5, 2, 6]])

np.apply_along_axis(sorted, 1, b) # sort along axis 1: each row is sorted, so the column layout changes

array([[1, 7, 8],
       [3, 4, 9],
       [2, 5, 6]])

np.apply_along_axis(sorted, 0, b) # sort along axis 0: each column is sorted, so the row layout changes

array([[4, 1, 6],
       [5, 2, 7],
       [8, 3, 9]])
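np.bincount has no axis parameter and only accepts 1-D input, so it is a natural candidate for apply_along_axis. The sketch below counts values row by row on a small made-up array and also shows that, for sorting, the built-in axis parameter of np.sort gives the same result as apply_along_axis with sorted.

```python
import numpy as np

labels = np.array([[0, 1, 1, 2],
                   [2, 2, 2, 0],
                   [1, 1, 1, 1]])       # hypothetical class labels, one row per sample

# extra keyword arguments are passed on to np.bincount for every row
counts = np.apply_along_axis(np.bincount, 1, labels, minlength=3)
print(counts)

b = np.array([[8, 1, 7], [4, 3, 9], [5, 2, 6]])
rows_sorted = np.apply_along_axis(sorted, 1, b)
print(np.array_equal(rows_sorted, np.sort(b, axis=1)))
```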

Floor division and remainder¶

for i in range(15):
    print(i,' | ',i//5,' | ', i%5)

0  |  0  |  0
1  |  0  |  1
2  |  0  |  2
3  |  0  |  3
4  |  0  |  4
5  |  1  |  0
6  |  1  |  1
7  |  1  |  2
8  |  1  |  3
9  |  1  |  4
10  |  2  |  0
11  |  2  |  1
12  |  2  |  2
13  |  2  |  3
14  |  2  |  4
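Python's built-in divmod returns the quotient and the remainder in one call, and np.divmod does the same element-wise. Floor division rounds towards negative infinity, which explains why an index of -1 maps to week -1, position 4 in the weekly listing earlier. A short sketch:

```python
import numpy as np

q, r = divmod(7, 5)
print(q, r)

# floor division rounds down, so the remainder keeps the sign of the divisor
neg_q, neg_r = divmod(-1, 5)
print(neg_q, neg_r)

# element-wise version for arrays
quotients, remainders = np.divmod(np.arange(-1, 4), 5)
print(quotients, remainders)
```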
