1 Preface

Course selection always feels like going into battle.
Everyone has courses they want, but the slots are scarce.
So people resort to every trick: some use JS, some use Chrome's console.
Life is short, I use Python.

2 Environment Dependencies

  • Python 2.7.12
  • (NEW) Python 3.3 & Python 3.6
  • Dependencies pinned with pip freeze > Requirement.txt

Requirement.txt

beautifulsoup4==4.6.0
bs4==0.0.1
configparser==3.5.0
lxml==3.7.3
requests==2.13.0
tqdm==4.11.2

3 Usage

Getting the program

You can git clone the latest version directly:

$ git clone https://github.com/okcd00/CDSelector.git
$ cd CDSelector
$ vim config     # edit your login information
$ vim courseid   # edit your course selections
$ python CDSelector.py

Alternatively, you can download the latest stable release from the Releases page:
https://github.com/okcd00/CDSelector/releases

Editing the config file

[info]
username = [your account name / email]
password = [your password]
runtime  = [seconds to wait between enrollment attempts]

[action]
debug = true        [debug mode dumps intermediate variables; set to false to save resources]
enroll = true       [keep retrying in an endless polling loop; hard to imagine when you would want false]
evaluate = true     [verify whether enrollment succeeded; recommended on]
select_bat = false  [batch selection, for special cases like English B where courses cannot be selected one at a time and two must be submitted in one form]
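As a quick sanity check, the config layout above parses with configparser.RawConfigParser, the same class the script uses. This is a minimal sketch; the username, password, and runtime values below are placeholders, not real settings:

```python
# A minimal sketch of reading the config layout described above with
# RawConfigParser -- the sample credentials are placeholder values.
from configparser import RawConfigParser

SAMPLE = """
[info]
username = student@example.com
password = secret
runtime  = 5

[action]
debug = true
enroll = true
evaluate = true
select_bat = false
"""

cf = RawConfigParser()
cf.read_string(SAMPLE)  # the script itself uses cf.read('config')
print(cf.get('info', 'username'))             # -> student@example.com
print(cf.getint('info', 'runtime'))           # -> 5
print(cf.getboolean('action', 'select_bat'))  # -> False
```

Note that the bracketed notes in the template above are explanations, not part of the file: a real value like `debug = true [comment]` would make getboolean raise a ValueError, so keep each value bare in your actual config.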

Editing the courseid file

List one course ID per line, like this:

091M7014H
091M7021H

In particular, if a course should be taken as a degree course,

091M7014H
091M7021H on

append a space and "on", as in the second line above.
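The format above boils down to a mapping from course ID to a degree-course flag. This sketch shows that parsing; parse_courseid is a hypothetical helper name for illustration, not a function from the script:

```python
# A small sketch of the courseid format: one course ID per line, with an
# optional " on" suffix marking a degree course.
def parse_courseid(text):
    courses = {}
    for line in text.splitlines():
        parts = line.strip().split()
        if not parts:
            continue
        # maps course ID -> True if it should be selected as a degree course
        courses[parts[0]] = len(parts) > 1 and parts[1] == 'on'
    return courses

print(parse_courseid("091M7014H\n091M7021H on"))
# -> {'091M7014H': False, '091M7021H': True}
```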

Then run CDSelector.py:

$ python CDSelector.py
Debug Mode: True
Login success
Enrolling start
> Course Selection is unreachable or not started. <1134> Thu Jun 01 08:43:42 2017

If you see an ImportError, some Python packages are missing. A Requirement.txt is provided, so from the current directory you can run

$ pip install -r Requirement.txt

to install all dependencies at once.
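The ImportError check can also be automated. This sketch reports which required packages fail to import; the list mirrors Requirement.txt, noting that the beautifulsoup4 package installs the `bs4` import name, and that on Python 3 `configparser` is in the standard library:

```python
# A sketch that reports which required packages cannot be imported.
import importlib

def missing_packages(names):
    """Return the subset of `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# import names corresponding to the entries in Requirement.txt
REQUIRED = ['requests', 'bs4', 'lxml', 'configparser', 'tqdm']
print(missing_packages(REQUIRED))  # an empty list means all dependencies are present
```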

4 Source Code

Since the program still gets updated from time to time, the listing below is only the v1.0.0 initial release.
The web-access part draws on scusjs, and the feature enhancements draw on zoecur.

(Updated 2017-09-07) Now refreshed to v1.0.7.

# coding=utf8
# =====================================================
#   Copyright (C) 2016 All rights reserved.
#
#   filename : CDSelector.py
#   author   : okcd00 / okcd00@qq.com
#   refer    : scusjs@foxmail.com
#   date     : 2017-01-06
#   desc     : UCAS Course_Selection Program
# =====================================================
import os
import sys
import time
import requests
from bs4 import BeautifulSoup
from configparser import RawConfigParser


class UCASEvaluate:
    def __init__(self):
        self.__readCoursesId('./courseid')
        cf = RawConfigParser()
        cf.read('config')
        self.username = cf.get('info', 'username')
        self.password = cf.get('info', 'password')
        self.runtime = cf.getint('info', 'runtime')
        self.debug = cf.getboolean('action', 'debug')
        self.enroll = cf.getboolean('action', 'enroll')
        self.evaluate = cf.getboolean('action', 'evaluate')
        self.select_bat = cf.getboolean('action', 'select_bat')

        self.loginPage = 'http://sep.ucas.ac.cn'
        self.loginUrl = self.loginPage + '/slogin'
        self.courseSystem = self.loginPage + '/portal/site/226/821'
        self.courseBase = 'http://jwxk.ucas.ac.cn'
        self.courseIdentify = self.courseBase + '/login?Identity='
        self.courseSelected = self.courseBase + '/courseManage/selectedCourse'
        self.courseSelectionBase = self.courseBase + '/courseManage/main'
        self.courseCategory = self.courseBase + '/courseManage/selectCourse?s='
        self.courseSave = self.courseBase + '/courseManage/saveCourse?s='
        self.studentCourseEvaluateUrl = 'http://jwjz.ucas.ac.cn/Student/DeskTopModules/'
        self.selectCourseUrl = 'http://jwjz.ucas.ac.cn/Student/DesktopModules/Course/SelectCourse.aspx'

        self.enrollCount = {}
        self.headers = {
            'Host': 'sep.ucas.ac.cn',
            'Connection': 'keep-alive',
            'Pragma': 'no-cache',
            'Cache-Control': 'no-cache',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36',
            'Accept-Encoding': 'gzip, deflate, sdch',
            'Accept-Language': 'zh-CN,zh;q=0.8,en;q=0.6',
        }
        self.s = requests.Session()
        loginPage = self.s.get(self.loginPage, headers=self.headers)
        self.cookies = loginPage.cookies

    def login(self):
        postdata = {
            'userName': self.username,
            'pwd': self.password,
            'sb': 'sb'
        }
        self.s.post(self.loginUrl, data=postdata, headers=self.headers)
        if 'sepuser' in self.s.cookies.get_dict():
            return True
        return False

    def getMessage(self, restext):
        css_soup = BeautifulSoup(restext, 'html.parser')
        text = css_soup.select(
            '#main-content > div > div.m-cbox.m-lgray > div.mc-body > div')[0].text
        return "".join(line.strip() for line in text.split('\n'))

    def __readCoursesId(self, filename):
        coursesFile = open(filename, 'r')
        self.coursesId = {}
        for line in coursesFile.readlines():
            # accept "courseId on" (or "courseId:on") to mark a degree course
            line = line.strip().replace(' ', ':').split(':')
            courseId = line[0]
            isDegree = False
            if len(line) == 2 and line[1] == 'on':
                isDegree = True
            self.coursesId[courseId] = isDegree

    def enrollCourses(self):
        response = self.s.get(self.courseSystem, headers=self.headers)
        soup = BeautifulSoup(response.text, 'html.parser')
        try:
            identity = str(soup).split('Identity=')[1].split('"')[0]
            coursePage = self.courseIdentify + identity
            response = self.s.get(coursePage)
            response = self.s.get(self.courseSelected)
            idx, lastMsg = 0, ""
            while True:
                msg = ""
                if self.select_bat:
                    result, msg = self.__enrollCourses(self.coursesId)
                    if result:
                        self.coursesId.clear()
                else:
                    for eachCourse in self.coursesId:
                        if eachCourse in response.text:
                            print("Course " + eachCourse + " has been selected.")
                            continue
                        if (eachCourse in self.enrollCount and
                                self.enrollCount[eachCourse] == 0):
                            continue
                        self.enrollCount[eachCourse] = 1
                        result, msg = self.__enrollCourse(
                            eachCourse, self.coursesId[eachCourse])
                        if result:
                            self.enrollCount[eachCourse] = 0
                    for enroll in self.enrollCount:
                        if self.enrollCount[enroll] == 0:
                            self.coursesId.pop(enroll)
                    self.enrollCount.clear()
                if not self.coursesId:
                    return 'INVALID COURSES_ID'
                idx += 1
                time.sleep(self.runtime)
                showText = "\r> " + "%s <%d> %s" % (
                    msg, idx, time.asctime(time.localtime(time.time())))
                lastMsg = msg
                sys.stdout.write(showText)
                sys.stdout.flush()
        except KeyboardInterrupt:
            print("\nKeyboardInterrupt Detected, bye!")
            return "STOP"
        except Exception as exception:
            return "Course_Selection_Port is not open, waiting..."

    def __enrollCourse(self, courseId, isDegree):
        response = self.s.get(self.courseSelectionBase)
        if self.debug:
            with open('./check.html', 'wb+') as f:
                f.write(response.text.encode('utf-8'))
        soup = BeautifulSoup(response.text, 'html.parser')
        categories = dict([(label.contents[0][:2], label['for'][3:])
                           for label in soup.find_all('label')[2:]])
        categoryId = categories[courseId[:2]]
        identity = soup.form['action'].split('=')[1]
        postdata = {
            'deptIds': categoryId,
            'sb': 0
        }
        categoryUrl = self.courseCategory + identity
        response = self.s.post(categoryUrl, data=postdata)
        if self.debug:
            print("Now Posting, save snapshot in check2.html.")
            with open('./check2.html', 'wb+') as f:
                f.write(response.text.encode('utf-8'))
        soup = BeautifulSoup(response.text, 'html.parser')
        courseTable = soup.body.form.table
        if courseTable:
            courseTable = courseTable.find_all('tr')[1:]
        else:
            return False, "Course Selection is unreachable or not started."
        courseDict = dict([(c.span.contents[0], c.span['id'].split('_')[1])
                           for c in courseTable])
        if courseId in courseDict:
            postdata = {
                'deptIds': categoryId,
                'sids': courseDict[courseId]
            }
            if isDegree:
                postdata['did_' + courseDict[courseId]] = courseDict[courseId]
            courseSaveUrl = self.courseSave + identity
            response = self.s.post(courseSaveUrl, data=postdata)
            print("Now Checking, save snapshot in result.html.")
            with open('result.html', 'wb+') as f:
                f.write(response.text.encode('utf-8'))
            if 'class="error' not in response.text:
                return True, '[Success] ' + courseId
            else:
                return False, self.getMessage(response.text).strip()
        else:
            return False, "No such course"

    def __enrollCourses(self, courseIds):  # batch selection, e.g. for English courses
        response = self.s.get(self.courseSelectionBase)
        if self.debug:
            with open('./check.html', 'wb+') as f:
                f.write(response.text.encode('utf-8'))
        soup = BeautifulSoup(response.text, 'html.parser')
        categories = dict([(label.contents[0][:2], label['for'][3:])
                           for label in soup.find_all('label')[2:]])
        identity = soup.form['action'].split('=')[1]
        categoryIds = []
        for courseId in courseIds:
            categoryIds.append(categories[courseId[:2]])
        postdata = {
            'deptIds': categoryIds,
            'sb': 0
        }
        categoryUrl = self.courseCategory + identity
        response = self.s.post(categoryUrl, data=postdata)
        if self.debug:
            print("Now Posting, save snapshot in check2.html.")
            with open('./check2.html', 'wb+') as f:
                f.write(response.text.encode('utf-8'))
        soup = BeautifulSoup(response.text, 'html.parser')
        courseTable = soup.body.form.table
        if courseTable:
            courseTable = courseTable.find_all('tr')[1:]
        else:
            return False, "Course Selection is unreachable or not started."
        courseDict = dict([(c.span.contents[0], c.span['id'].split('_')[1])
                           for c in courseTable])
        postdata = {
            'deptIds': categoryIds,
            'sids': [courseDict[courseId] for courseId in courseIds]
        }
        courseSaveUrl = self.courseSave + identity
        response = self.s.post(courseSaveUrl, data=postdata)
        print("Now Checking, save snapshot in result.html.")
        with open('result.html', 'wb+') as f:
            f.write(response.text.encode('utf-8'))
        if 'class="error' not in response.text:
            return True, '[Success] ' + courseId
        else:
            return False, self.getMessage(response.text).strip()


if __name__ == "__main__":
    print("starting...")
    os.system('MODE con: COLS=128 LINES=32 & TITLE Welcome to CDSelector')
    from logo import show_logo
    show_logo()  # delete this for a faster start
    os.system('cls')
    time.sleep(1)
    os.system("color 0A")
    os.system('MODE con: COLS=80 LINES=10 & TITLE CD_Course_Selecting is working')
    while True:
        try:
            ucasEvaluate = UCASEvaluate()
            break
        except Exception as e:
            if e.args[0] == "Connection aborted.":
                ucasEvaluate = UCASEvaluate()
    if ucasEvaluate.debug:
        print("Debug Mode: %s" % str(ucasEvaluate.debug))
        print("In debug mode, you can check snapshot with html files.")
        print("By the way, Ctrl+C to stop.")
    if not ucasEvaluate.login():
        print('Login error. Please check your username and password.')
        exit()
    print('Login success: ' + ucasEvaluate.username)
    print('Enrolling starts')
    while ucasEvaluate.enroll:
        status = ucasEvaluate.enrollCourses()
        if status == 'STOP':
            break
        else:
            status += time.asctime(time.localtime(time.time()))
            sys.stdout.write("%s\r" % status)
    print('Enrolling finished')

5 Where to Get It

  • Github: https://github.com/okcd00/CDSelector
  • Release: https://github.com/okcd00/CDSelector/releases
  • Documentation: http://blog.csdn.net/okcd00/article/details/72827861
