
Output pagination logs

maxiaoshan 2 years ago
commit 2ac844cb98
2 changed files with 107 additions and 1 deletion
  1. src/logs/spider.log (+106 -0)
  2. src/spider/spider.go (+1 -1)

+ 106 - 0
src/logs/spider.log

@@ -191514,3 +191514,109 @@ stack traceback:
 2022/11/01 11:14:53 spider.go:777: info  Running Code: a_zgzbtbggfwpt_zhbhxrgs2 Stop: false
 2022/11/01 11:14:53 spider.go:843: info  Thread Info:	Code: a_zgzbtbggfwpt_zhbhxrgs2 	count: 1 	thread num: 0
 2022/11/01 11:15:04 upload.go:114: debug  上传文件成功! a_zgzbtbggfwpt_zhbhxrgs2 	 https://details.cebpubservice.com:7443/bulletin/getBulletin/8a9494757a859f1701823393ef287237 	 98033bff22afc8ac4e835399a4613a1dcb64f5bc0f10c1a0e29dd2b05ca30f18.pdf 	 金牛湖野生动物王国AAAA创建暨综合提升项目(消防系统升级)项目专项设计中标候选人公示.pdf 177 KB
+2022/11/25 14:43:02 main.go:139: debug  7400
+2022/11/25 14:43:02 spider.go:1088: info  Detail Download All Thread: 0
+2022/11/25 14:43:02 handler.go:405: info  节点 7100 脚本文件爬虫数 0
+2022/11/25 14:43:02 handler.go:405: info  节点 7100 脚本文件爬虫数 0
+2022/11/25 14:43:02 spider.go:181: debug  a_zgzcgylgldzcgpt_jjgg 中国中车供应链管理电子采购平台 频率: 30 , 150
+2022/11/25 14:43:02 handler.go:136: info  高性能模式:LUA加载完成
+2022/11/25 14:43:02 handler.go:142: info  总共加载脚本数: 1
+2022/11/25 14:43:16 spider.go:410: info  a_zgzcgylgldzcgpt_jjgg 本轮列表页采集详情: 158 130 13 false
+2022/11/25 14:43:17 spider.go:166: debug  a_zgzcgylgldzcgpt_jjgg 中国中车供应链管理电子采购平台 ok,本轮下载量: 0 ,轮询数据长度: 1 ,下线数量: 0 ,下线爬虫: []
+2022/11/25 14:44:45 main.go:139: debug  7400
+2022/11/25 14:44:45 spider.go:1088: info  Detail Download All Thread: 0
+2022/11/25 14:44:45 handler.go:405: info  节点 7100 脚本文件爬虫数 0
+2022/11/25 14:44:45 handler.go:405: info  节点 7100 脚本文件爬虫数 0
+2022/11/25 14:44:45 spider.go:181: debug  a_zgzcgylgldzcgpt_jjgg 中国中车供应链管理电子采购平台 频率: 30 , 150
+2022/11/25 14:44:45 handler.go:136: info  高性能模式:LUA加载完成
+2022/11/25 14:44:45 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:01:23 main.go:139: debug  7400
+2022/12/06 10:01:23 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:01:23 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:01:23 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:01:23 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:01:23 handler.go:142: info  总共加载脚本数: 0
+2022/12/06 10:02:04 main.go:139: debug  7400
+2022/12/06 10:02:04 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:02:04 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:02:04 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:02:04 spider.go:181: debug  sd_zgsdzfcgw_sxzhbgg_new_bu 中国山东政府采购网 频率: 30 , 150
+2022/12/06 10:02:04 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:02:04 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:06:07 main.go:139: debug  7400
+2022/12/06 10:06:07 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:06:07 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:06:07 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:06:07 spider.go:181: debug  sd_zgsdzfcgw_sxzhbgg_new_bu 中国山东政府采购网 频率: 30 , 150
+2022/12/06 10:06:08 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:06:08 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:09:52 main.go:139: debug  7400
+2022/12/06 10:09:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:09:52 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:09:52 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:09:52 spider.go:181: debug  sd_zgsdzfcgw_sxzhbgg_new_bu 中国山东政府采购网 频率: 30 , 150
+2022/12/06 10:09:52 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:09:52 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:10:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:11:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:12:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:13:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:14:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:15:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:16:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:17:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:18:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:19:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:20:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:21:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:22:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:23:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:24:52 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:24:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:25:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:26:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:27:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:28:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:29:52 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:30:40 main.go:139: debug  7400
+2022/12/06 10:30:40 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:30:40 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:30:40 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:30:41 spider.go:181: debug  sd_zgsdzfcgw_sxzhbgg_new_bu 中国山东政府采购网 频率: 30 , 150
+2022/12/06 10:30:41 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:30:41 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:31:39 main.go:139: debug  7400
+2022/12/06 10:31:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:31:39 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:31:39 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:31:39 spider.go:181: debug  sd_zgsdzfcgw_sxzhbgg_new_bu 中国山东政府采购网 频率: 30 , 150
+2022/12/06 10:31:39 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:31:39 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:32:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:33:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:34:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:35:13 download.go:129: error  sd_zgsdzfcgw_sxzhbgg_new_bu方法DownloadAdv,url:http://www.ccgp-shandong.gov.cn/sdgp2017/site/listnew.jsp,err:timeout 150
+2022/12/06 10:35:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:36:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:37:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:38:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:39:10 download.go:129: error  sd_zgsdzfcgw_sxzhbgg_new_bu方法DownloadAdv,url:http://www.ccgp-shandong.gov.cn/sdgp2017/site/listnew.jsp,err:timeout 150
+2022/12/06 10:39:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:40:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:41:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:42:01 download.go:129: error  sd_zgsdzfcgw_sxzhbgg_new_bu方法DownloadAdv,url:http://www.ccgp-shandong.gov.cn/sdgp2017/site/listnew.jsp,err:timeout 150
+2022/12/06 10:42:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:43:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:44:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:45:39 spider.go:1089: info  Detail Download All Thread: 0
+2022/12/06 10:58:55 main.go:139: debug  7400
+2022/12/06 10:58:55 spider.go:1088: info  Detail Download All Thread: 0
+2022/12/06 10:58:55 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:58:55 handler.go:405: info  节点 7400 脚本文件爬虫数 0
+2022/12/06 10:58:55 spider.go:181: debug  sd_zgsdzfcgw_sxzhbgg_new_bu 中国山东政府采购网 频率: 30 , 150
+2022/12/06 10:58:55 spider.go:290: info  重复页: 0 	配置最大页: 1377 	最终最大页: 1377 	当前页: 1 重复次数: 0
+2022/12/06 10:58:56 handler.go:136: info  高性能模式:LUA加载完成
+2022/12/06 10:58:56 spider.go:764: info  +++++++++++++++++++Download Detail+++++++++++++++++++
+2022/12/06 10:58:56 handler.go:142: info  总共加载脚本数: 1
+2022/12/06 10:58:56 spider.go:777: info  Running Code: sd_zgsdzfcgw_sxzhbgg_new_bu Stop: false
+2022/12/06 10:58:56 spider.go:843: info  Thread Info:	Code: sd_zgsdzfcgw_sxzhbgg_new_bu 	count: 2944 	thread num: 10
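The log hunk above records three `DownloadAdv` timeouts for `sd_zgsdzfcgw_sxzhbgg_new_bu` among the once-a-minute status lines. A minimal sketch of pulling such lines out of a log with this layout follows. The `timeoutHit` type and `parseTimeouts` function are hypothetical names, and the line layout is inferred from the sample lines only, not from the project's logger.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// timeoutHit is a hypothetical record for one "err:timeout" log line.
type timeoutHit struct {
	Timestamp string // e.g. "2022/12/06 10:35:13"
	Code      string // spider code, e.g. sd_zgsdzfcgw_sxzhbgg_new_bu
	URL       string // URL that timed out
}

// parseTimeouts scans log text line by line and returns one timeoutHit per
// error line of the assumed form
//   <date> <time> <file>:<line>: error  <code>方法<method>,url:<url>,err:timeout <n>
func parseTimeouts(logText string) []timeoutHit {
	var hits []timeoutHit
	sc := bufio.NewScanner(strings.NewReader(logText))
	for sc.Scan() {
		// Drop the leading "+" when the input is copied from a diff.
		line := strings.TrimPrefix(sc.Text(), "+")
		head, msg, ok := strings.Cut(line, ": error")
		if !ok || len(head) < 19 {
			continue // not an error line, or no full timestamp
		}
		code, rest, ok := strings.Cut(strings.TrimSpace(msg), "方法")
		if !ok {
			continue
		}
		_, rest, ok = strings.Cut(rest, ",url:")
		if !ok {
			continue
		}
		url, errText, ok := strings.Cut(rest, ",err:")
		if !ok || !strings.HasPrefix(errText, "timeout") {
			continue
		}
		// The first 19 bytes are the "YYYY/MM/DD hh:mm:ss" timestamp.
		hits = append(hits, timeoutHit{Timestamp: head[:19], Code: code, URL: url})
	}
	return hits
}

func main() {
	sample := "+2022/12/06 10:35:13 download.go:129: error  sd_zgsdzfcgw_sxzhbgg_new_bu方法DownloadAdv,url:http://www.ccgp-shandong.gov.cn/sdgp2017/site/listnew.jsp,err:timeout 150\n" +
		"+2022/12/06 10:35:39 spider.go:1089: info  Detail Download All Thread: 0\n"
	for _, h := range parseTimeouts(sample) {
		fmt.Printf("%s %s %s\n", h.Timestamp, h.Code, h.URL)
	}
}
```

Lines that do not match the assumed layout are skipped rather than reported, so the status lines never produce false hits.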

+ 1 - 1
src/spider/spider.go

@@ -287,7 +287,7 @@ func (s *Spider) DownListPageItem() (errs interface{}) {
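 		if !s.Stop { // the spider was taken offline while downloading detail pages; stop recording heartbeats
 			UpdateHeart(s.Name, s.Channel, s.Code, s.MUserName, "list") // record the list-page heartbeat for all nodes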
 		}
-		//qu.Debug("重复页:", repeatPageNum, "	配置最大页:", tmpMax, "	最终最大页:", max, "	当前页:", start, "重复次数:", repeatPageTimes)
+		logger.Info("重复页:", repeatPageNum, "	配置最大页:", tmpMax, "	最终最大页:", max, "	当前页:", start, "重复次数:", repeatPageTimes)
 		//if start > tmpMax && isRunRepeatList && repeatPageTimes >= 5 { //重复次数超过5次,不再翻页
 		//	break
 		//}
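The re-enabled `logger.Info` call passes labels and values as alternating arguments, which yields lines like the `spider.go:290` entry in the log. As an illustration only, the same counters could be rendered as a single key=value line that is easier to grep. The `formatPageLog` helper, its English labels, and the `int` parameter types are assumptions and are not part of this commit.

```go
package main

import "fmt"

// formatPageLog is a hypothetical helper that renders the pagination
// counters from DownListPageItem as one key=value line. The parameter
// types are assumed to be int; the source does not show their declarations.
func formatPageLog(repeatPageNum, tmpMax, finalMax, start, repeatPageTimes int) string {
	return fmt.Sprintf(
		"repeat_pages=%d configured_max=%d final_max=%d current_page=%d repeat_times=%d",
		repeatPageNum, tmpMax, finalMax, start, repeatPageTimes,
	)
}

func main() {
	// Values taken from the spider.go:290 line in the log above.
	fmt.Println(formatPageLog(0, 1377, 1377, 1, 0))
}
```

A fixed format string keeps the spacing between fields consistent, whereas the variadic call depends on whitespace embedded in each label.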