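I've just started learning Python and ran into this problem while practicing with a web scraper. Any pointers would be much appreciated.
Code: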
from bs4 import BeautifulSoup
from urllib import request
from datetime import datetime
# Fetch the web page
response = request.urlopen("http://fund.eastmoney.com/fund.html")
html = response.read()
html = html.decode('gbk')  # why is this step needed?
with open("./htmls/1.txt", 'wb') as f:
    f.write(html.encode('utf8'))
    f.close()
with open("./htmls/1.txt", 'rb') as f:
    html = f.read().decode('utf8')
    f.close()
soup = BeautifulSoup(html, "html.parser")
fCodes = soup.find("table", id="oTable").tbody.find_all("td", "bzdm")  # fund codes
result = ()
for fCode in fCodes:
    result += (
        {
            "fcode": fCode.get_text(),
            "fname": fCode.next_sibling.find("a").get_text(),
            "NAV": fCode.next_sibling.next_sibling.get_text(),
            "ACCNAV": fCode.next_sibling.next_sibling.next_sibling.get_text() if fCode.next_sibling.next_sibling.next_sibling.get_text() != '---' else 0.0000,
            "updatetime": datetime.now().isoformat(sep=' ', timespec="seconds")
        },)
# print(result)
import pymysql
from pymysql.cursors import Cursor, SSCursor
# from common.config import dbconfig
# connection = pymysql.connect(**dbconfig)
# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='root',
                             password='root',
                             db='ins',
                             charset='utf8',
                             cursorclass=pymysql.cursors.DictCursor)
cursor = Cursor(connection)
sql = """insert into myfund(fcode, fname,NAV,ACCNAV,updatetime)
values(%(fcode)s,%(fname)s,%(NAV)s,%(ACCNAV)s,%(updatetime)s)
ON duplicate KEY UPDATE `updatetime`=%(updatetime)s,NAV=%(NAV)s,ACCNAV=%(ACCNAV)s"""
res = cursor.executemany(sql, result)
cursor.fetchall()
connection.commit()
print(res)
connection.close()
Error:
Traceback (most recent call last):
File "/Users/carl/wwwroot/TestAction/PythonActions/jtthinkPythonActions/day08_tuple/mypro/__main__db.py", line 68, in <module>
res = cursor.executemany(sql, result)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/cursors.py", line 192, in executemany
self._get_db().encoding)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/cursors.py", line 229, in _do_execute_many
rows += self.execute(sql + postfix)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/cursors.py", line 165, in execute
result = self._query(query)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/cursors.py", line 321, in _query
conn.query(q)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/connections.py", line 860, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/connections.py", line 1061, in _read_query_result
result.read()
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/connections.py", line 1349, in read
first_packet = self.connection._read_packet()
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/connections.py", line 1018, in _read_packet
packet.check_error()
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/connections.py", line 384, in check_error
err.raise_mysql_exception(self._data)
File "/Users/carl/wwwroot/TestAction/PythonActions/venvActions/lib/python3.6/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%(updatetime)s,NAV=%(NAV)s,ACCNAV=%(ACCNAV)s' at line 3")
The data can be inserted, but only if I remove the ON DUPLICATE KEY UPDATE clause and empty the table first.
Table structure:
-- auto-generated definition
CREATE TABLE myfund
(
    fcode      VARCHAR(20)    NOT NULL,
    fname      VARCHAR(20)    NULL,
    NAV        DECIMAL(10, 4) NULL
        COMMENT 'net asset value per unit',
    ACCNAV     DECIMAL(10, 4) NULL
        COMMENT 'accumulated net value',
    updatetime DATETIME       NULL,
    fdate      DATETIME       NOT NULL
        COMMENT 'fund date',
    DGR        VARCHAR(20)    NULL
        COMMENT 'daily growth rate',
    DGV        VARCHAR(20)    NULL
        COMMENT 'daily growth value',
    fee        VARCHAR(20)    NULL,
    PRIMARY KEY (fcode, fdate)
)
    ENGINE = InnoDB;
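Judging from the traceback, the failure happens inside PyMySQL's `_do_execute_many`, at `self.execute(sql + postfix)`, and the server complains about a literal `%(updatetime)s`. That suggests `executemany` fills in only the placeholders of the `VALUES (...)` group and appends the `ON DUPLICATE KEY UPDATE` clause unchanged. The sketch below imitates that behaviour with a simplified regular expression. `SPLIT_INSERT`, `split_insert`, and the sample `row` are hypothetical names and made-up values for illustration, not PyMySQL's actual code.

```python
import re

# Simplified, illustrative pattern (an assumption, not PyMySQL's real regex):
# group 1 = "INSERT ... VALUES", group 2 = the "(...)" placeholder list,
# group 3 = the optional "ON DUPLICATE ..." tail.
SPLIT_INSERT = re.compile(
    r"\s*(INSERT\b.+?\bVALUES\s*)(\(.+?\))(\s*(?:ON DUPLICATE.*)?)\s*\Z",
    re.IGNORECASE | re.DOTALL,
)


def split_insert(sql):
    """Split an INSERT statement into (prefix, values, postfix)."""
    match = SPLIT_INSERT.match(sql)
    if match is None:
        raise ValueError("not a simple INSERT ... VALUES statement")
    return match.group(1), match.group(2), match.group(3)


sql = """insert into myfund(fcode, fname,NAV,ACCNAV,updatetime)
values(%(fcode)s,%(fname)s,%(NAV)s,%(ACCNAV)s,%(updatetime)s)
ON duplicate KEY UPDATE `updatetime`=%(updatetime)s,NAV=%(NAV)s,ACCNAV=%(ACCNAV)s"""

prefix, values, postfix = split_insert(sql)

# Made-up, already-quoted sample values standing in for one escaped row.
row = {
    "fcode": "'000001'",
    "fname": "'demo fund'",
    "NAV": "1.0000",
    "ACCNAV": "1.0000",
    "updatetime": "'2018-01-01 00:00:00'",
}

# Only the VALUES group is formatted; the tail is appended as-is,
# so its %(name)s placeholders reach the server as literal text.
statement = prefix + values % row + postfix
print(statement)
```

If this reading is right, it also explains why the statement works once the `ON DUPLICATE KEY UPDATE` clause is removed: nothing is left outside the `VALUES (...)` group that needs substituting.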
sql = """insert into myfund(fcode, fname,NAV,ACCNAV,updatetime)
values(%(fcode)s,%(fname)s,%(NAV)s,%(ACCNAV)s,%(updatetime)s)
ON duplicate KEY UPDATE `updatetime`=%(updatetime)s,NAV=%(NAV)s,ACCNAV=%(ACCNAV)s"""
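For this statement, it looks like you still need to supply the values for placeholders such as %(fcode)s; those values come from your earlier parsing of the HTML.
For example, you could write:
datainform = dict()
datainform["fcode"] = "002003"
datainform["NAV"] = "cc"
# ... fill in fname, ACCNAV and updatetime the same way; every placeholder in sql needs a key
execsql = sql % datainform  # plain string formatting: values are not quoted or escaped
That should do it.
The above is only an example to illustrate the idea; the details depend on how you want your overall program to work.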
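An alternative that keeps `executemany` and parameter binding: MySQL lets the `ON DUPLICATE KEY UPDATE` clause refer to the row being inserted through `VALUES(col)`, so the clause needs no placeholders of its own. Here is a minimal sketch, assuming the `myfund` table from the question. `build_upsert` is a hypothetical helper, and the commented-out call assumes the `cursor`, `connection`, and `result` objects from the question.

```python
def build_upsert(table, columns, update_columns):
    """Build an INSERT ... ON DUPLICATE KEY UPDATE statement.

    Named placeholders appear only in the VALUES (...) list; the update
    clause reuses the inserted values through MySQL's VALUES(col).
    Table and column names are interpolated directly, so pass trusted
    names only.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("%({})s".format(col) for col in columns)
    updates = ", ".join(
        "{0} = VALUES({0})".format(col) for col in update_columns
    )
    return (
        "INSERT INTO {} ({})\n"
        "VALUES ({})\n"
        "ON DUPLICATE KEY UPDATE {}"
    ).format(table, column_list, placeholders, updates)


sql = build_upsert(
    "myfund",
    ["fcode", "fname", "NAV", "ACCNAV", "updatetime"],
    ["updatetime", "NAV", "ACCNAV"],
)
print(sql)

# With the cursor and rows from the question (not run here):
# res = cursor.executemany(sql, result)
# connection.commit()
```

Note that MySQL 8.0.20 and later deprecate `VALUES()` in this position in favour of row aliases, although it still works; on such servers the alias form may be the better long-term choice.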