2 - Problem Statement


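  • You want to expand your business into cities that are likely to bring your restaurant higher profits.
  • The chain already has restaurants in various cities, and you have profit and population data for those cities.
  • You also have data for some cities that are candidates for a new restaurant.
    • For these cities, you have the city's population.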


3 - Dataset


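  • The load_data() function shown below loads the data into the variables x_train and y_train.

    • x_train is the population of a city.
    • y_train is the profit of a restaurant in that city. A negative profit value indicates a loss.
    • Both x_train and y_train are NumPy arrays.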
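The load_data() helper comes from the course's utils module, which this post does not show. For readers without that file, here is a minimal sketch of a stand-in. It assumes the data is stored as comma-separated text with population in the first column and profit in the second. The name load_data_from_text and the sample rows are hypothetical, chosen only for illustration.

```python
import io

import numpy as np

# Hypothetical sample in the assumed layout: population, profit (made-up numbers)
SAMPLE_TEXT = """\
6.0,17.5
5.5,9.0
8.5,13.5
7.0,-1.2
"""


def load_data_from_text(text):
    """Parse two-column comma-separated text into (x, y) NumPy arrays."""
    data = np.loadtxt(io.StringIO(text), delimiter=",")
    x = data[:, 0]  # first column: city population
    y = data[:, 1]  # second column: restaurant profit (negative means a loss)
    return x, y


x_demo, y_demo = load_data_from_text(SAMPLE_TEXT)
print("x shape:", x_demo.shape, "y shape:", y_demo.shape)
```

Both returned arrays are one-dimensional with one entry per city, which is the shape the functions later in this post expect.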
import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
# %matplotlib inline (notebook only); in a plain script, call plt.show() instead
# load the dataset
x_train, y_train = load_data()
# print x_train
print("Type of x_train:", type(x_train))
print("First five elements of x_train are:\n", x_train[:5])
# print y_train
print("Type of y_train:", type(y_train))
print("First five elements of y_train are:\n", y_train[:5])
# Print the shapes of x_train and y_train to see how many training examples the dataset has.
print('The shape of x_train is:', x_train.shape)
print('The shape of y_train is: ', y_train.shape)
print('Number of training examples (m):', len(x_train))
# Create a scatter plot of the data. To change the markers to red "x",
# we use the 'marker' and 'c' parameters (c stands for color)
plt.scatter(x_train, y_train, marker='x', c='r')
# Set the title
plt.title("Profits vs. Population per city")
# 设置y轴标签
plt.ylabel('Profit in $10,000')
# 设置x轴标签
plt.xlabel('Population of City in 10,000s')
plt.show()


def compute_cost(x, y, w, b):
    # number of training examples
    m = x.shape[0]

    # You need to return this variable correctly
    total_cost = 0

    ### START CODE HERE ###
    # Variable to keep track of sum of cost from each example
    cost_sum = 0

    # Loop over training examples
    for i in range(m):
        # Your code here to get the prediction f_wb for the ith example
        f_wb = w * x[i] + b
        # Your code here to get the cost associated with the ith example
        cost = (f_wb - y[i]) ** 2
        # Add to sum of cost for each example
        cost_sum = cost_sum + cost

    # Get the total cost as the sum divided by (2*m)
    total_cost = (1 / (2 * m)) * cost_sum
    ### END CODE HERE ###

    return total_cost


# UNQ_C2
# GRADED FUNCTION: compute_gradient
def compute_gradient(x, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      x (ndarray): Shape (m,) Input to the model (Population of cities)
      y (ndarray): Shape (m,) Label (Actual profits for the cities)
      w, b (scalar): Parameters of the model
    Returns
      dj_dw (scalar): The gradient of the cost w.r.t. the parameters w
      dj_db (scalar): The gradient of the cost w.r.t. the parameter b
    """
    # Number of training examples
    m = x.shape[0]

    # You need to return the following variables correctly
    dj_dw = 0
    dj_db = 0

    ### START CODE HERE ###
    for i in range(m):
        f_wb = w * x[i] + b
        dj_db_i = f_wb - y[i]
        dj_dw_i = (f_wb - y[i]) * x[i]
        # Update dj_db : In Python, a += 1  is the same as a = a + 1
        dj_db += dj_db_i
        # Update dj_dw
        dj_dw += dj_dw_i

    # Divide both dj_dw and dj_db by m
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    # Vectorized alternative to the loop above:
    #    dj_db = (w * x + b - y).sum() / m
    #    dj_dw = ((w * x + b - y) * x).sum() / m
    ### END CODE HERE ###

    return dj_dw, dj_db


def gradient_descent(x, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """
    Performs batch gradient descent to learn w and b. Updates w and b by taking
    num_iters gradient steps with learning rate alpha
    Args:
      x :    (ndarray): Shape (m,)
      y :    (ndarray): Shape (m,)
      w_in, b_in : (scalar) Initial values of parameters of the model
      cost_function: function to compute cost
      gradient_function: function to compute the gradient
      alpha : (float) Learning rate
      num_iters : (int) number of iterations to run gradient descent
    Returns
      w : (scalar) Updated value of parameter w after running gradient descent
      b : (scalar) Updated value of parameter b after running gradient descent
      J_history : (list) Cost recorded at each iteration
      w_history : (list) Values of w recorded at each printing interval
    """
    # number of training examples
    m = len(x)

    # Lists to store cost J and w's during training, primarily for graphing later
    J_history = []
    w_history = []
    w = copy.deepcopy(w_in)  # avoid modifying global w within function
    b = b_in

    for i in range(num_iters):
        # Calculate the gradient and update the parameters
        dj_dw, dj_db = gradient_function(x, y, w, b)

        # Update Parameters using w, b, alpha and gradient
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # Save cost J at each iteration
        if i < 100000:  # prevent resource exhaustion
            cost = cost_function(x, y, w, b)
            J_history.append(cost)

        # Print cost at 10 evenly spaced intervals, or at every iteration if num_iters < 10
        if i % math.ceil(num_iters / 10) == 0:
            w_history.append(w)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}   ")

    return w, b, J_history, w_history  # return w and J,w history for graphing


# initialize fitting parameters (w is a scalar here because there is a single feature)
initial_w = 0.
initial_b = 0.
# some gradient descent settings
iterations = 1500
alpha = 0.01
w, b, _, _ = gradient_descent(x_train, y_train, initial_w, initial_b,
                              compute_cost, compute_gradient, alpha, iterations)
print("w,b found by gradient descent:", w, b)
m = x_train.shape[0]
predicted = np.zeros(m)
for i in range(m):
    predicted[i] = w * x_train[i] + b
# Plot the linear fit
plt.plot(x_train, predicted, c="b")
# Create a scatter plot of the data.
plt.scatter(x_train, y_train, marker='x', c='r')
# Set the title
plt.title("Profits vs. Population per city")
# Set the y-axis label
plt.ylabel('Profit in $10,000')
# Set the x-axis label
plt.xlabel('Population of City in 10,000s')
plt.show()
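The commented-out lines inside compute_gradient hint at a loop-free form. As a sketch (not part of the original lab code), the same cost and gradients can be written with NumPy array operations. The names compute_cost_vec and compute_gradient_vec, and the tiny data set below, are made up for illustration.

```python
import numpy as np


def compute_cost_vec(x, y, w, b):
    """Squared-error cost: sum((w*x + b - y)**2) / (2*m), without a Python loop."""
    m = x.shape[0]
    err = w * x + b - y  # prediction error for every example at once
    return float(np.sum(err ** 2) / (2 * m))


def compute_gradient_vec(x, y, w, b):
    """Gradients of the cost above with respect to w and b."""
    m = x.shape[0]
    err = w * x + b - y
    dj_dw = float(np.sum(err * x) / m)
    dj_db = float(np.sum(err) / m)
    return dj_dw, dj_db


# Made-up data lying exactly on the line y = 2x
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 4.0, 6.0])

print(compute_cost_vec(x_demo, y_demo, 2.0, 0.0))      # perfect fit, so the cost is 0.0
print(compute_gradient_vec(x_demo, y_demo, 0.0, 0.0))  # both gradients are negative here
```

Because both functions take the same arguments as compute_cost and compute_gradient, they should be usable as the cost_function and gradient_function arguments of gradient_descent, assuming x and y are one-dimensional arrays of equal length.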


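The problem statement mentions candidate cities for which only the population is known. Once w and b have been fitted, estimating the profit for such a city is a single evaluation of the model. A minimal sketch follows. The function name predict_profit and the values w_demo and b_demo are placeholders, not results from the run above. Inputs and outputs use the same units as the plot: population in 10,000s and profit in $10,000.

```python
import numpy as np


def predict_profit(population_10k, w, b):
    """Evaluate the linear model w * x + b for one or more populations."""
    return w * np.asarray(population_10k, dtype=float) + b


# Placeholder parameters for illustration only (not fitted values)
w_demo, b_demo = 1.2, -3.5

# Hypothetical candidate cities with 35,000 and 70,000 inhabitants
candidates = np.array([3.5, 7.0])
estimates = predict_profit(candidates, w_demo, b_demo)

for pop, profit in zip(candidates, estimates):
    print(f"Population {pop * 10_000:,.0f}: estimated profit ${profit * 10_000:,.2f}")
```

In practice, w_demo and b_demo would be replaced by the w and b returned from gradient_descent.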
