I recently saw a train ticket query tool implemented in Python and wrote one myself. I learned a lot from it, so I will walk through each step in detail below. (Note: Python 3 is used.)
First, the final result:
Run on the cmd command line: python tickets.py -dk shanghai chengdu 20161007 > result.txt
This queries the trains whose numbers start with D or K from Shanghai to Chengdu on October 7, 2016, and saves the output to result.txt, which then holds the query results.
The implementation steps are as follows:
1. Install the third-party libraries with pip install: requests, docopt, prettytable
2. docopt can be used to parse the parameters entered on the command line:

    """
    Usage:
        test [-gdtkz] <from> <to> <date>

    Options:
        -h,--help   Show the help menu
        -g          High-speed rail (G)
        -d          D-series train
        -t          Express (T)
        -k          Fast (K)
        -z          Direct (Z)

    Example:
        tickets -gdt beijing shanghai 2016-08-25
    """
    import docopt

    args = docopt.docopt(__doc__)
    print(args)

    # Inside the """ """ block above, the part
    # Usage:
    #     test [-gdtkz] <from> <to> <date>
    # is required. The name "test" can be anything; it does not affect parsing.
The printed result is a dictionary, which the later steps use.
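For the command shown at the start, the parsed dictionary would look roughly like the hand-written one below. This is only an illustration: the literal dictionary and the helper selected_flags are assumptions made for the example, and the real dictionary is whatever docopt returns.

```python
# Illustrative only: a hand-written dictionary shaped like the parsed result
# for `-dk shanghai chengdu 20161007`. The real one is produced by docopt.
args = {
    '-g': False,
    '-d': True,
    '-t': False,
    '-k': True,
    '-z': False,
    '<from>': 'shanghai',
    '<to>': 'chengdu',
    '<date>': '20161007',
}


def selected_flags(parsed):
    """Return the option flags that were switched on, in sorted order."""
    return sorted(key for key, value in parsed.items()
                  if key.startswith('-') and value is True)


print(selected_flags(args))
print(args['<from>'], args['<to>'], args['<date>'])
```

Options appear as boolean values keyed by their flag, and positional arguments are keyed by their name including the angle brackets.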
3. Obtain train information
The 12306 interface for querying remaining tickets:
url: 'https://kyfw.12306.cn/otn/lcxxcx/query'
The method is: get
Parameters sent: purpose_codes:ADULT, queryDate:2016-10-05, from_station:CDW, to_station:SHH
The station codes for the cities have to be looked up through a separate interface.
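As a rough sketch, the query URL can be assembled from these parameters with the standard library. The endpoint is the one that appears in tickets.py later in this article; it dates from 2016 and may no longer respond. The function name build_query_url is an assumption made for this example.

```python
from urllib.parse import urlencode

# Endpoint as used in this article's tickets.py (2016); it may have changed since.
QUERY_URL = 'https://kyfw.12306.cn/otn/lcxxcx/query'


def build_query_url(query_date, from_code, to_code):
    """Build the GET URL for one ticket query (hypothetical helper)."""
    params = [
        ('purpose_codes', 'ADULT'),
        ('queryDate', query_date),
        ('from_station', from_code),
        ('to_station', to_code),
    ]
    return QUERY_URL + '?' + urlencode(params)


print(build_query_url('2016-10-05', 'CDW', 'SHH'))
```

Using urlencode instead of string formatting keeps the parameters escaped correctly if a value ever contains special characters.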
3.1 Look up the station code for a city:
The URL of this interface is 'https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8968'
The method is get. Apply a regular expression to the returned text to extract each station's name and code. The returned values look like 7@cqn|Chongqing South|CRW|chongqingnan|cqn|, and what we need from each entry is the code and the pinyin name: CRW and chongqingnan. This is done in a script:
parse_stations.py: this script requests the URL above, applies the regular expression to the response text, and prints the resulting name-to-code dictionary.
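A minimal sketch of the extraction step, assuming the response has the shape described above. The SAMPLE string is made up for illustration (only the codes CDW, SHH and CRW are quoted in this article), the regular expression is one possible choice, and the network request is left out so the example runs offline.

```python
import re
from pprint import pprint

# Made-up sample in the shape described above; the real file is much longer.
SAMPLE = ('@cdu|成都|CDW|chengdu|cd|0'
          '@sha|上海|SHH|shanghai|sh|1'
          '@cqn|重庆南|CRW|chongqingnan|cqn|2')


def parse_stations(text):
    """Map each pinyin station name to its station code."""
    # An upper-case code followed by '|' and a lower-case pinyin name.
    pairs = re.findall(r'([A-Z]+)\|([a-z]+)', text)
    return {name: code for code, name in pairs}


stations = parse_stations(SAMPLE)
pprint(stations, indent=4)
```

In the real script the text would come from a GET request to the station_name.js URL rather than from a constant.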
The pprint module prints the information in a form that is easier to read.
Run in cmd: python parse_stations.py > stations.py
This creates the stations.py file in the current directory. The file contains the station names and their codes. Add "stations = " at the start of stations.py so that the content becomes a dictionary named stations, which makes later lookups easy.
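Once stations.py exists, looking up a code is a plain dictionary access. The sketch below uses a hypothetical three-entry excerpt (the codes are the ones quoted in this article) and an assumed helper, station_code, that reports unknown names instead of silently returning None.

```python
# Hypothetical excerpt of the generated dictionary; the real one has
# thousands of entries.
stations = {
    'chengdu': 'CDW',
    'shanghai': 'SHH',
    'chongqingnan': 'CRW',
}


def station_code(name):
    """Look up a station code, failing loudly on an unknown name."""
    code = stations.get(name.strip().lower())
    if code is None:
        raise KeyError('unknown station: {0}'.format(name))
    return code


print(station_code('Shanghai'))
```

Failing early here avoids sending a query with from_station=None to the server.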
3.2 The parameters for requesting train information are now ready. The next step is to fetch the train data and parse out the fields we need, such as the train number and the number of first-class tickets. This is done in myprettytable.py:

    # coding=utf-8
    from prettytable import PrettyTable


    class TrainCollection(object):
        """Parse the train information."""

        # Columns: No., train number, departure/arrival station,
        # departure/arrival time, duration, business class, first class,
        # second class, soft sleeper, hard sleeper, hard seat, no seat
        header = '序号 车次 出发站/到达站 出发时间/到达时间 历时 商务座 一等座 二等座 软卧 硬卧 硬座 无座'.split()

        def __init__(self, rows, traintypes):
            self.rows = rows
            self.traintypes = traintypes

        def _get_duration(self, row):
            """Get the running time of the train."""
            # '小时' means hours and '分' means minutes
            duration = row.get('lishi').replace(':', '小时') + '分'
            if duration.startswith('00'):
                return duration[4:]
            elif duration.startswith('0'):
                return duration[1:]
            return duration

        @property
        def trains(self):
            result = []
            flag = 0
            for row in self.rows:
                if row['station_train_code'][0] in self.traintypes:
                    flag += 1
                    train = [
                        # sequence number
                        flag,
                        # train number
                        row['station_train_code'],
                        # departure / arrival station
                        '/'.join([row['from_station_name'], row['to_station_name']]),
                        # departure / arrival time
                        '/'.join([row['start_time'], row['arrive_time']]),
                        # duration
                        self._get_duration(row),
                        # business class
                        row['swz_num'],
                        # first class
                        row['zy_num'],
                        # second class
                        row['ze_num'],
                        # soft sleeper
                        row['rw_num'],
                        # hard sleeper
                        row['yw_num'],
                        # hard seat
                        row['yz_num'],
                        # no seat
                        row['wz_num']
                    ]
                    result.append(train)
            return result

        def print_pretty(self):
            """Print the train information."""
            pt = PrettyTable()
            # Use the public field_names property instead of the private setter
            pt.field_names = self.header
            for train in self.trains:
                pt.add_row(train)
            print(pt)


    if __name__ == '__main__':
        # Smoke test with an empty result set; the constructor needs both arguments
        t = TrainCollection([], ['G', 'D', 'T', 'K', 'Z'])
        t.print_pretty()
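The slicing in _get_duration is easy to misread, so here is the same logic as a standalone function with a few inputs. The name format_duration and the sample values are assumptions for illustration; the logic mirrors the method above.

```python
def format_duration(lishi):
    """Turn an 'HH:MM' string into the display form used in the table."""
    duration = lishi.replace(':', '小时') + '分'  # e.g. '02:35' -> '02小时35分'
    if duration.startswith('00'):
        return duration[4:]   # drop the four characters '00小时'
    if duration.startswith('0'):
        return duration[1:]   # drop only the leading zero of the hours
    return duration


for sample in ('00:35', '02:35', '12:05'):
    print(sample, '->', format_duration(sample))
```

Trips under one hour show only the minutes, and a leading zero on the hour is dropped.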
The prettytable library prints tables in a format similar to the one MySQL uses to display query results.
4. The next step is to integrate the various modules: tickets.py

    """Train tickets query via command-line.

    Usage:
        tickets [-gdtkz] <from> <to> <date>

    Options:
        -h,--help   Show the help menu
        -g          High-speed rail (G)
        -d          D-series train
        -t          Express (T)
        -k          Fast (K)
        -z          Direct (Z)

    Example:
        tickets -gdt beijing shanghai 2016-08-25
    """
    import requests
    from docopt import docopt
    from stations import stations
    # from pprint import pprint
    from myprettytable import TrainCollection


    class SelectTrain(object):

        def __init__(self):
            """Get the parameters entered on the command line."""
            # All command-line arguments, returned as a dictionary
            self.args = docopt(__doc__)

        def cli(self):
            """command-line interface"""
            # Get the departure and destination stations
            from_station = stations.get(self.args['<from>'])  # departure station
            to_station = stations.get(self.args['<to>'])  # destination station
            leave_time = self._get_leave_time()  # departure date
            # Build the URL that requests the train information
            url = 'https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate={0}&from_station={1}&to_station={2}'.format(
                leave_time, from_station, to_station)
            # Fetch the query result
            r = requests.get(url, verify=False)
            # Decode the JSON response and take out 'datas' for parsing below
            traindatas = r.json()['data']['datas']
            # Parse the train information
            traintypes = self._get_traintype()
            views = TrainCollection(traindatas, traintypes)
            views.print_pretty()

        def _get_traintype(self):
            """
            Get the train types. With -g only high-speed trains are returned,
            with -gd both D-series and high-speed trains are returned, and
            with no option all trains are returned.
            """
            traintypes = ['-g', '-d', '-t', '-k', '-z']
            # result = []
            # for traintype in traintypes:
            #     if self.args[traintype]:
            #         result.append(traintype[-1].upper())
            trains = [traintype[-1].upper() for traintype in traintypes if self.args[traintype]]
            if trains:
                return trains
            else:
                return ['G', 'D', 'T', 'K', 'Z']

        def _get_leave_time(self):
            """
            Get the departure date. This function lets the date be entered
            in two formats: 2016-10-05 and 20161005.
            """
            leave_time = self.args['<date>']
            if len(leave_time) == 8:
                return '{0}-{1}-{2}'.format(leave_time[:4], leave_time[4:6], leave_time[6:])
            if '-' in leave_time:
                return leave_time


    if __name__ == '__main__':
        cli = SelectTrain()
        cli.cli()
That is basically it. Using the command shown at the beginning, you can now query the train information you want.
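One detail worth a second look is the date handling in _get_leave_time: it returns None for input that is neither eight characters long nor contains a hyphen, and it does not check that the date is real. A stricter variant, sketched here with the standard datetime module, accepts the same two formats and rejects everything else. This is an alternative under stated assumptions, not the code used above, and the name normalize_date is made up for the example.

```python
from datetime import datetime


def normalize_date(text):
    """Accept '2016-10-05' or '20161005' and return 'YYYY-MM-DD'."""
    for fmt in ('%Y-%m-%d', '%Y%m%d'):
        try:
            return datetime.strptime(text, fmt).strftime('%Y-%m-%d')
        except ValueError:
            continue  # try the next format
    raise ValueError('unrecognized date: {0!r}'.format(text))


print(normalize_date('20161007'))
print(normalize_date('2016-10-05'))
```

Raising an error on bad input stops the program before a malformed queryDate is sent to the server.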