Newton Interpolation
Divided Differences
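Definition: Let \(f(x)\) take the values \(f_i\) at the distinct nodes \(x_i\), \(i=0,1,\dots,n\). Then \(f[x_i,x_j]=\frac{f_i-f_j}{x_i-x_j}\) is called the first-order divided difference of \(f(x)\) with respect to the nodes \(x_i,x_j\), and \(f[x_i,x_j,x_k]=\frac{f[x_i,x_j]-f[x_j,x_k]}{x_i-x_k}\) is the second-order divided difference with respect to \(x_i,x_j,x_k\). Continuing in the same way gives the k-th order divided difference: \(f[x_0,x_1,\dots,x_k]=\frac{f[x_0,\dots,x_{k-1}]-f[x_1,\dots,x_k]}{x_0-x_k}\).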
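The recursive definition of divided differences can be sketched in a few lines of Python. The function names `divided_differences` and `newton_eval` are hypothetical, and the evaluation step assumes the standard Newton form \(N(x)=f[x_0]+f[x_0,x_1](x-x_0)+\dots\), which the text above does not spell out.

```python
def divided_differences(xs, fs):
    """Return [f[x0], f[x0,x1], ..., f[x0,...,xn]] for distinct nodes xs."""
    n = len(xs)
    coef = list(fs)
    for order in range(1, n):
        # Walk backwards so the lower-order values still needed are not overwritten.
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef


def newton_eval(xs, coef, x):
    """Evaluate the Newton-form polynomial at x with a Horner-style loop."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result


# Sample f(x) = x**2 at three nodes and interpolate at x = 2.
nodes = [0, 1, 3]
values = [0, 1, 9]
coef = divided_differences(nodes, values)
approx = newton_eval(nodes, coef, 2)
```

Because three nodes determine a quadratic exactly, `approx` reproduces the value of \(x^2\) at the query point.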
https://leetcode-cn.com/problems/search-in-rotated-sorted-array/
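This is clearly a binary search, but the array is only partially sorted rather than fully sorted, so each step has to check which half is in order.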
class Solution(object):
    def search(self, nums, target):
        left, right = 0, len(nums) - 1
        while left <= right:
            mid = left + (right - left) // 2
            if nums[mid] == target:
                return mid
            if nums[mid] < nums[right]:  # the right half is ascending
                if nums[mid] < target <= nums[right]:
                    left = mid + 1
                else:
                    right = mid
            if nums[left] <= nums[mid]:  # the left half is ascending
                if nums[left] <= target < nums[mid]:
                    right = mid
                else:
                    left = mid + 1
        return -1
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.read_csv("./datasets/studentscores.csv")
data.head()
| | Hours | Scores |
|---|---|---|
| 0 | 2.5 | 21 |
| 1 | 5.1 | 47 |
| 2 | 3.2 | 27 |
| 3 | 8.5 | 75 |
| 4 | 3.5 | 30 |
X = data.iloc[:,:1].values
Y = data.iloc[:,1].values
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=1/4,random_state=0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor = regressor.fit(X_train,Y_train)
Y_pred = regressor.predict(X_test)
plt.scatter(X_train,Y_train,color='red')
plt.plot(X_train,regressor.predict(X_train),color='blue')
plt.scatter(X_test , Y_test, color = 'red')
plt.plot(X_test , regressor.predict(X_test), color ='blue')
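With a single feature, the ordinary least squares line has a closed form, which makes it easy to see what the regressor above is estimating. The sketch below is plain Python; `fit_line` is a hypothetical helper, and the sample points are only the five rows shown by `data.head()`, not the full dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


hours = [2.5, 5.1, 3.2, 8.5, 3.5]   # the rows printed by data.head()
scores = [21, 47, 27, 75, 30]
slope, intercept = fit_line(hours, scores)
predicted = [slope * h + intercept for h in hours]
```

The slope is the estimated change in score per extra hour of study. Fitting on the full file with a train/test split, as the cells above do, will give somewhat different numbers.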
import pandas as pd
import numpy as np
data = pd.read_csv("./datasets/50_Startups.csv")
data.head()
| | R&D Spend | Administration | Marketing Spend | State | Profit |
|---|---|---|---|---|---|
| 0 | 165349.20 | 136897.80 | 471784.10 | New York | 192261.83 |
| 1 | 162597.70 | 151377.59 | 443898.53 | California | 191792.06 |
| 2 | 153441.51 | 101145.55 | 407934.54 | Florida | 191050.39 |
| 3 | 144372.41 | 118671.85 | 383199.62 | New York | 182901.99 |
| 4 | 142107.34 | 91391.77 | 366168.42 | Florida | 166187.94 |
X = data.iloc[:,:-1].values
Y = data.iloc[:,-1].values
from sklearn.preprocessing import LabelEncoder,OneHotEncoder
labelEncoder = LabelEncoder()
X[:,3] = labelEncoder.fit_transform(X[:,3])  # encode the State column as integers
# OneHotEncoder() with no column selection one-hot encodes every column of X, not only State
onehotencoder = OneHotEncoder()
X = onehotencoder.fit_transform(X).toarray()
X = X[:,1:]  # drop the first encoded column
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train,Y_train)
LinearRegression()
A learning curve can be used to diagnose bias and variance problems.
from sklearn.model_selection import train_test_split,learning_curve
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
digits = load_digits()
X = digits.data
Y = digits.target
train_sizes,train_loss,test_loss = learning_curve(SVC(gamma=0.001),X,Y,cv=10,
                                                  scoring='neg_mean_squared_error',
                                                  train_sizes=[0.1,0.25,0.5,0.75,1])
train_sizes
array([ 161, 404, 808, 1212, 1617])
train_loss
array([[-0. , -0.09937888, -0.09937888, -0.09937888, -0.09937888,
-0.09937888, -0.09937888, -0.09937888, -0.09937888, -0.09937888],
[-0. , -0.03960396, -0.03960396, -0.03960396, -0.03960396,
-0.03960396, -0.03960396, -0.03960396, -0.03960396, -0.03960396],
[-0. , -0.01980198, -0.01980198, -0.06435644, -0.01980198,
-0.01980198, -0.01980198, -0.01980198, -0.01980198, -0.01980198],
[-0. , -0.01650165, -0.01320132, -0.01320132, -0.01320132,
-0.01320132, -0.01320132, -0.01320132, -0.01320132, -0.01320132],
[-0.02226345, -0.03215832, -0.00989487, -0.03215832, -0.03215832,
-0.03215832, -0.03215832, -0.03215832, -0.03215832, -0.00989487]])
test_loss
array([[-1.26666667e+00, -1.43333333e+00, -3.96666667e+00,
-9.73888889e+00, -6.95000000e+00, -5.24444444e+00,
-3.02777778e+00, -5.25139665e+00, -3.48044693e+00,
-4.85474860e+00],
[-1.81111111e+00, -1.13333333e+00, -1.35555556e+00,
-3.06666667e+00, -2.08333333e+00, -2.85000000e+00,
-8.38888889e-01, -1.94413408e+00, -5.41899441e-01,
-1.35195531e+00],
[-1.71111111e+00, -3.61111111e-01, -5.11111111e-01,
-9.61111111e-01, -6.16666667e-01, -5.88888889e-01,
-1.22222222e-01, -9.16201117e-01, -7.76536313e-01,
-1.14525140e+00],
[-1.22222222e+00, -3.61111111e-01, -4.44444444e-01,
-7.00000000e-01, -5.55555556e-01, -2.66666667e-01,
-8.88888889e-02, -1.11731844e-02, -9.21787709e-01,
-8.43575419e-01],
[-9.33333333e-01, -0.00000000e+00, -2.66666667e-01,
-2.83333333e-01, -2.77777778e-01, -3.61111111e-01,
-8.88888889e-02, -5.58659218e-03, -9.21787709e-01,
-4.18994413e-01]])
train_mean = -np.mean(train_loss,axis=1)
test_mean = -np.mean(test_loss,axis=1)
train_mean
array([0.08944099, 0.03564356, 0.02227723, 0.01221122, 0.02671614])
plt.plot(train_sizes,train_mean,label="Training")
plt.plot(train_sizes,test_mean,label="Cross-validation")
plt.legend()
plt.show()
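The grey forecast model (Gray Forecast Model) builds a mathematical model from a small amount of incomplete information and uses it to make predictions. It is an effective tool for small-sample forecasting (as few as 4 samples are enough), a setting in which regression and neural networks tend to perform poorly.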
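The text does not say which grey model it has in mind. The sketch below assumes GM(1,1), the most common variant: accumulate the series, fit the two parameters by least squares on the background values, then difference the fitted exponential to get forecasts. The name `gm11_forecast` and the sample series are illustrative only.

```python
import math


def gm11_forecast(x0, steps=1):
    """Forecast the next `steps` values of a positive series with GM(1,1)."""
    n = len(x0)
    if n < 4:
        raise ValueError("need at least 4 samples")
    # Accumulated series x1 (1-AGO) and background values z.
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Least squares for y = -a * z + b, i.e. slope = -a and intercept = b.
    m = len(z)
    z_mean, y_mean = sum(z) / m, sum(y) / m
    slope = (sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
             / sum((zi - z_mean) ** 2 for zi in z))
    a, b = -slope, y_mean - slope * z_mean
    if a == 0:
        raise ValueError("development coefficient is zero; GM(1,1) does not apply")

    def x1_hat(k):
        # Fitted accumulated series at index k (k = 0 is the first sample).
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Differencing the accumulated fit restores the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


pred = gm11_forecast([100, 110, 121, 133.1], steps=1)
```

The sample series grows by 10% per step, so `pred[0]` should land close to the next value of that geometric progression.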
Recommended site: http://joyfulpandas.datawhale.club/Content/Preface.html
pandas core operations handbook: https://mp.weixin.qq.com/s/l1V5e726XixI0W3EDHx0Nw
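merge can be said to subsume join. merge matches two DataFrames on columns or on the index; by default it matches on the shared columns and keeps the intersection of keys (an inner join), whereas join is a shorthand for the index-based case. The pandas merge method provides an in-memory join operation similar to SQL, and the official documentation notes that it performs better than comparable data operations in other open-source languages (R, for example). If you are familiar with SQL, merge is easy to understand. Parameters of merge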
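A small sketch of the difference between the two calls, using made-up frames `left` and `right` (the column names are illustrative). `pd.merge` defaults to `how="inner"` on the shared column, while `DataFrame.join` works on the index and defaults to `how="left"`.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# Inner join on the shared column: only keys present in both frames survive.
inner = pd.merge(left, right, on="key")

# Outer join keeps every key and fills the gaps with NaN.
outer = pd.merge(left, right, on="key", how="outer")

# join matches on the index and keeps all rows of the calling frame by default.
joined = left.set_index("key").join(right.set_index("key"))

print(inner)
print(outer)
print(joined)
```

The `joined` frame keeps key `a` with a missing `rval`, which is the same result as `pd.merge(left, right, on="key", how="left")` apart from the index.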
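Linear discriminant analysis, i.e. LDA (not to be confused with the LDA of topic modeling), is now commonly used for dimensionality reduction, but as its name suggests it is also a classification algorithm. It is a hard classifier: the result is a concrete class label rather than a probability.

## Main idea

1. Small within-class variance
2. Large between-class variance

## Derivation

The two-class case is used as the example here, i.e. there are only two classes.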
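For two classes, the two goals above lead to the standard Fisher direction \(w \propto S_w^{-1}(m_a - m_b)\), where \(S_w\) is the within-class scatter matrix and \(m_a, m_b\) are the class means. The derivation itself is not reproduced here; the sketch below only applies that result to 2-D points, and the helper names and sample data are illustrative.

```python
def fisher_direction(class_a, class_b):
    """Return (w, mean_a, mean_b) with w = Sw^{-1} (mean_a - mean_b) for 2-D points."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries of the symmetric 2x2 scatter matrix sum((p - m)(p - m)^T).
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy   # within-class scatter Sw
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Multiply the inverse of [[sxx, sxy], [sxy, syy]] by (dx, dy).
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    return w, ma, mb


def classify(x, w, ma, mb):
    """Hard decision: compare the projection of x with the midpoint of the projected means."""
    def proj(p):
        return w[0] * p[0] + w[1] * p[1]
    threshold = (proj(ma) + proj(mb)) / 2
    return "a" if proj(x) > threshold else "b"


class_a = [(4, 2), (5, 3), (6, 2), (5, 1)]
class_b = [(1, 1), (2, 2), (1, 3), (0, 2)]
w, ma, mb = fisher_direction(class_a, class_b)
label = classify((4.5, 2), w, ma, mb)
```

The midpoint threshold is one simple choice for turning the projection into a class label; other thresholds are possible when the classes have different sizes or spreads.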
\(P(B|A) = \frac{P(AB)}{P(A)}\)
If \(P(A) > 0\), then \(P(AB) = P(A)P(B|A)\). If \(P(A_1 \dots A_{n-1}) > 0\), then
\[ \begin{aligned} P(A_1A_2\dots A_n) &= P(A_1A_2\dots A_{n-1})P(A_n | A_1A_2\dots A_{n-1}) \\ &= P(A_1)P(A_2|A_1)P(A_3|A_1A_2)\dots P(A_n|A_1A_2\dots A_{n-1}) \end{aligned} \]
The first step applies the multiplication rule; applying it again to the first factor, and so on, gives the final result.
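As a quick numeric check of the multiplication rule, consider drawing three cards without replacement and asking for three aces (an example chosen here for illustration). The chain of conditional probabilities should agree with a direct count; `chain_rule` is a hypothetical helper.

```python
from fractions import Fraction
from math import comb


def chain_rule(conditionals):
    """Multiply P(A1), P(A2|A1), P(A3|A1A2), ... into the joint probability."""
    p = Fraction(1)
    for c in conditionals:
        p *= c
    return p


# P(A1) = 4/52, P(A2|A1) = 3/51, P(A3|A1A2) = 2/50
p_chain = chain_rule([Fraction(4, 52), Fraction(3, 51), Fraction(2, 50)])

# Direct count: choose 3 of the 4 aces out of all 3-card hands.
p_count = Fraction(comb(4, 3), comb(52, 3))

print(p_chain, p_count)
```

Exact fractions are used so that the two results can be compared without rounding error.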
In np.rot90, a positive rotation count turns the matrix counterclockwise and a negative one turns it clockwise.
import numpy as np
mat = np.array([[1,3,5],
                [2,4,6],
                [7,8,9]
               ])
print mat, "# original"
mat90 = np.rot90(mat, 1)
print mat90, "# rotate 90 <left> anti-clockwise"
mat90 = np.rot90(mat, -1)
print mat90, "# rotate 90 <right> clockwise"
mat180 = np.rot90(mat, 2)
print mat180, "# rotate 180 <left> anti-clockwise"
mat270 = np.rot90(mat, 3)
print mat270, "# rotate 270 <left> anti-clockwise"
The code is copied as-is and is Python 2; it is enough to be able to read it.
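For reference, a Python 3 sketch of the same calls, where the only change needed is `print` as a function. The variable names `ccw`, `cw`, `half` and `three_quarters` are introduced here for readability.

```python
import numpy as np

mat = np.array([[1, 3, 5],
                [2, 4, 6],
                [7, 8, 9]])

ccw = np.rot90(mat, 1)             # 90 degrees counterclockwise
cw = np.rot90(mat, -1)             # 90 degrees clockwise
half = np.rot90(mat, 2)            # 180 degrees
three_quarters = np.rot90(mat, 3)  # 270 degrees counterclockwise

print(mat, "# original")
print(ccw, "# rotate 90 counterclockwise")
print(cw, "# rotate 90 clockwise")
print(half, "# rotate 180")
print(three_quarters, "# rotate 270 counterclockwise")
```

Rotating 270 degrees counterclockwise gives the same matrix as rotating 90 degrees clockwise, so `three_quarters` and `cw` are equal.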