

news 2026/5/5 5:56:09

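Requirement: I recently got interested in Python crawlers, so I followed existing examples and tried to crawl model photos from a site I used to like: http://www.mm131.com/xinggan. On that site every picture of a set sits on its own page, so saving one set by hand means clicking through dozens of pages. With a crawler it becomes easy: enter the id of the set and the whole set is saved to disk. As the saying goes, "talk is cheap, show me the code!"

A typical web crawl goes through these steps:

1. View the source of the target page and find the content to crawl.
2. Extract that content with regular expressions or another tool such as xpath or bs4.
3. Write the complete Python code that carries out the crawl.

1. Target URL

    url = 'http://www.mm131.com/xinggan/2373.html'

2. Analyzing the source

Press F12 and you can find the following two lines:

    src="http://img1.mm131.com/pic/2373/1.jpg"
    <span class="page-ch">共56页</span>

They tell us the following. The URL of the first page is http://www.mm131.com/xinggan/2373.html. The first line is the URL of the image on the first page, where 2373 is the id of the set. The second line (共56页, "56 pages in total") shows that the set has 56 pictures. Clicking through to the second and third pages and viewing their source shows that their URLs are http://www.mm131.com/xinggan/2373_2.html and http://www.mm131.com/xinggan/2373_3.html, and that the image URL follows the same pattern as on the first page, with 1.jpg becoming 2.jpg.

3. Crawling the image

Let's try to crawl the image on the first page. Straight to the code:

    import requests
    import re

    url = 'http://www.mm131.com/xinggan/2373.html'
    html = requests.get(url).text  # read the whole page as text
    a = re.search(r'img alt=.* src="(.*?)" /', html, re.S)  # match the image url
    print(a.group(1))

This prints:

    http://img1.mm131.com/pic/2373/1.jpg

Next we save the image locally:

    pic = requests.get(a.group(1), timeout=2)  # timeout keeps the program from waiting forever
    fp = open('pic.jpg', 'wb')  # create a new file in binary write mode
    fp.write(pic.content)  # write the image into the file
    fp.close()

Now the first picture is on your disk. Since the first one is saved, the rest should not be left out either, so here is more code.

4. Completing the code

Load the required modules and set the directory where the images are stored. The script is written for Python 2 (it uses raw_input and unicode):

    #coding:utf-8
    import requests
    import re
    import os
    from bs4 import BeautifulSoup

    pic_id = raw_input('Input pic id: ')
    os.chdir('G:\\pic')
    homedir = os.getcwd()
    print('Current directory: %s' % homedir)
    # images are saved under the given directory, in a folder named after the set id
    fulldir = unicode(os.path.join(homedir, pic_id), encoding='utf-8')
    if not os.path.isdir(fulldir):
        os.makedirs(fulldir)

Because we have to keep turning pages to reach every image, we first get the total number of pages:

    url = 'http://www.mm131.com/xinggan/%s.html' % pic_id
    html = requests.get(url).text
    #soup = BeautifulSoup(html)
    # the line above raises "UserWarning: No parser was explicitly specified",
    # so name the parser explicitly
    soup = BeautifulSoup(html, 'html.parser')
    ye = soup.span.string  # text of the first <span>, e.g. 共56页
    ye_count = re.search(r'\d+', ye)
    print('Total pages: %d' % int(ye_count.group()))

The main function:

    def downpic(pic_id):
        n = 1
        url = 'http://www.mm131.com/xinggan/%s.html' % pic_id
        while n <= int(ye_count.group()):  # stop after the last page
            # download one image
            try:
                if not n == 1:
                    # the url changes with the value of n
                    url = 'http://www.mm131.com/xinggan/%s_%s.html' % (pic_id, n)
                html = requests.get(url).text
                # pull the image url out with a regular expression
                pic_url = re.search(r'img alt=.* src="(.*?)" /', html, re.S)
                pic_s = pic_url.group(1)
                print(pic_s)
                pic = requests.get(pic_s, timeout=2)
                pic_cun = fulldir + '\\' + str(n) + '.jpg'
                fp = open(pic_cun, 'wb')
                fp.write(pic.content)
                fp.close()
                n += 1
            except requests.exceptions.ConnectionError:
                print('[Error] This image could not be downloaded')
                n += 1  # move on to the next page so the loop cannot retry forever
                continue

    if __name__ == '__main__':
        downpic(pic_id)

Then run the program.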
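The URL pattern and the two extraction steps described above can be checked without touching the network. The following is a minimal sketch, assuming Python 3 and an HTML fragment shaped like the two lines quoted from the page source. The helper names page_url, parse_image_url and parse_page_count, and the SAMPLE_HTML string, are hypothetical and are not part of the original script; the regular expressions are also narrower than the one the script uses.

```python
import re

BASE = "http://www.mm131.com/xinggan"

# Hypothetical fragment modelled on the two source lines quoted in the analysis.
SAMPLE_HTML = (
    '<img alt="sample" src="http://img1.mm131.com/pic/2373/1.jpg" />\n'
    '<span class="page-ch">共56页</span>'
)


def page_url(pic_id, n):
    """Build the page URL: page 1 has no suffix, page n > 1 is <id>_<n>.html."""
    if n == 1:
        return "%s/%s.html" % (BASE, pic_id)
    return "%s/%s_%d.html" % (BASE, pic_id, n)


def parse_image_url(html):
    """Return the src of the first <img> tag, or None if nothing matches."""
    m = re.search(r'<img[^>]*\ssrc="([^"]+)"', html)
    return m.group(1) if m else None


def parse_page_count(html):
    """Return the number inside the page-ch span, or None if it is missing."""
    m = re.search(r'class="page-ch"[^>]*>\D*(\d+)', html)
    return int(m.group(1)) if m else None


if __name__ == "__main__":
    print(page_url("2373", 1))
    print(page_url("2373", 2))
    print(parse_image_url(SAMPLE_HTML))
    print(parse_page_count(SAMPLE_HTML))
```

Both parsers return None instead of raising when the markup does not match, so a download loop built on them could skip a page rather than crash on pic_url.group(1).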

