Problem: extract the values of status and request_time from the log.
The script fails to match this log line:
{"@timestamp":"2018-03-30T19:24:26+08:00","server_addr":"172.31.0.24","remote_addr":"10.59.23.86","scheme":"https","host":"api.mycomapp.com","method":"GET","uri":"/app/global/2/android.json?mark=gif&version=96&app=&language=en","url":"/app/global/2/android.json","protocol":"HTTP/1.1","status":"200","size":10206,"request_time":"0.159","upstream_time":"0.159","upstream_addr":"192.31.2.78:80","referer":"-","agent":"Mozilla/5.0 (Linux; Android 6.0.1; SM-G900M Build/MMB29M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/63.0.3239.111 Mobile Safari/537.36 News/96 Android/96 AppFootball/96 SDK/23 PackageName/com.mycom.news","http_x_forwarded_for":"-","uuid":"@hu87x0VcRPqJ9SwLZtQVpJWArHrnN7xA/j0cHVxLDY/ZUVIxXhZn","authorization":"-","lang":"en-US","route_id":"4c64bfcc996f11381e19d85b2e8ce9f3","product":"mycom","subsys":"api","uuidx":"-","version_name":"2.6.2","package":"com.mycom.news","auid":"-"}
The Python code is as follows:
#!/usr/bin/env python
import re
import sys

# Matches only unquoted values, e.g. "status":200 and "request_time":0.006
regex = re.compile(r'{".*"status":(\d+).*request_time":(\d+\.\d+)')

def process_line(line):
    ro = regex.match(line)
    if ro:
        status, reqtime = ro.groups()
        return status, reqtime

if __name__ == '__main__':
    for line in sys.stdin:
        print(process_line(line))
Run it like this:
# cat t.log |python logclean.py
None
A log line that does match:
{"@timestamp":"2018-03-30T18:31:27+08:00","server_addr":"192.31.3.181","remote_addr":"197.210.173.163","scheme":"http","host":"api.abccccdapp.com","method":"GET","uri":"/app/archives/info?id=614464&language=en","url":"/index.php","protocol":"HTTP/1.0","status":200,"size":390,"request_time":0.006,"upstream_time":"0.006","upstream_host":"127.0.0.1:9000","referer":"abccccd://v1/main/home/tablist/http://app.abccccdapp.com/navite?push","agent":"Mozilla/5.0 (Linux; Android 7.1.1; SM-C7108 Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/59.0.3071.125 Mobile Safari/537.36 News/100 Android/100 App/100 SDK/25 PackageName/com.abccccd.news","xff":"197.21.173.163","uuid":"@nBFS5qbkHR3H0rJ2tFlR4cJsz","authorization":"-","lang":"en-US","route_id":"8c2352d0045f02c79722cb03261c0f88","sign":"MGD8xl2Ue6lagzzorNUnsCFNpsb7QcOVWf1QI="}
# cat 2.txt|python logclean.py
('200', '0.006')
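The regex did not match and the function returned None. How should the regex be written? Thanks.
The log lines are not uniform: some values are quoted and some are not. Your regex does not handle the quoted case.
Parsing as JSON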
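A small check makes the difference visible. This is only a sketch: the two sample strings are shortened, hypothetical stand-ins for the log lines above that keep just the relevant fields, and the pattern is the one from the question.

```python
import re

# Pattern from the question: expects bare (unquoted) numbers.
regex = re.compile(r'{".*"status":(\d+).*request_time":(\d+\.\d+)')

# Shortened stand-ins for the two log formats (assumed, not the full lines).
quoted = '{"host":"a","status":"200","size":1,"request_time":"0.159"}'
unquoted = '{"host":"a","status":200,"size":1,"request_time":0.006}'

quoted_match = regex.match(quoted)
unquoted_match = regex.match(unquoted)

print(quoted_match)             # None: \d+ cannot match the opening quote
print(unquoted_match.groups())  # ('200', '0.006')
```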
>>> import json
>>> s = json.loads(line)
>>> s['status']
'200'
>>> s['request_time']
'0.159'
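As a rough sketch, the JSON approach could replace process_line from the question as shown here. Converting with int() and float() is an assumption about what the caller wants; it has the side effect of making the quoted and unquoted formats produce the same result. The sample strings are shortened stand-ins, not the full log lines.

```python
import json


def process_line(line):
    """Return (status, request_time) as (int, float), or None if unusable."""
    try:
        record = json.loads(line)
        # int()/float() accept both "200" and 200, "0.159" and 0.159.
        return int(record["status"]), float(record["request_time"])
    except (ValueError, KeyError, TypeError):
        # Not valid JSON, a field is missing, or a value is not numeric.
        return None


# Shortened stand-ins for the two log formats (assumed, not the full lines).
print(process_line('{"status":"200","request_time":"0.159"}'))  # (200, 0.159)
print(process_line('{"status":200,"request_time":0.006}'))      # (200, 0.006)
print(process_line('not json'))                                 # None
```

To process a file as in the question, loop over sys.stdin and call process_line on each line.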
Regex matching
>>> import re
>>> p = re.compile(r'"status":"?(?P<status>\d+)"?.*?"request_time":"?(?P<request_time>\d+\.\d+)"?')
>>> p.search(line).group('status')
'200'
>>> p.search(line).group('request_time')
'0.159'
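Wrapped into a function, the regex approach might look like the sketch below. The names PATTERN and extract are made up for illustration, and the sample strings are shortened stand-ins for the real log lines. Note that the pattern still requires a decimal point in request_time, so a bare integer value would not match.

```python
import re

# Optional quotes ("?) around each value cover both log formats.
PATTERN = re.compile(
    r'"status":"?(?P<status>\d+)"?'
    r'.*?"request_time":"?(?P<request_time>\d+\.\d+)"?'
)


def extract(line):
    """Return (status, request_time) as strings, or None if no match."""
    m = PATTERN.search(line)  # search(), not match(): the fields sit mid-line
    if m is None:
        return None
    return m.group("status"), m.group("request_time")


# Shortened stand-ins for the two log formats (assumed, not the full lines).
print(extract('{"status":"200","size":10206,"request_time":"0.159"}'))
print(extract('{"status":200,"size":390,"request_time":0.006}'))
```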