To complete the task described above, I will provide the code in steps. First, here is the code for Task 1, which scrapes the Three Hundred Tang Poems (唐诗三百首) from the website and saves them to a txt file.

### Task 1: Scrape the Three Hundred Tang Poems

```python
import requests
import re

# Fetch the index page, extract each poem's link, title, and author,
# then follow each link to grab the poem body.
def fetch_poems(url):
    response = requests.get(url)
    poems_info = re.findall(
        r'<a href="(.*?)"\s*target="_blank" title="(.*?)">\s*<span\s*style="font-size:16px;">(.*?)</span>\s*</a>',
        response.text,
        re.S  # DOTALL so "." matches newlines; lowercase re.s does not exist
    )
    poems = []
    for info in poems_info:
        # Link hrefs on the index page are relative; prepend the site root.
        poem_url = "https://so.gushiwen.cn" + info[0]
        poem_response = requests.get(poem_url)
        poem_content = re.findall(
            r'<div class="contson" id="contson.*?>(.*?)</div>',
            poem_response.text,
            re.S
        )[0]
        poems.append((info[1], info[2], poem_content))
    return poems

# Scrape the Three Hundred Tang Poems index
url = "https://so.gushiwen.cn/gushi/tangshi.aspx"
poems = fetch_poems(url)

# Save to a txt file, one poem per line: title, author, content
with open("唐诗三百首.txt", "w", encoding="utf-8") as f:
    for poem in poems:
        f.write(f"{poem[0]} {poem[1]} {poem[2]}\n")
```
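Two practical caveats: the captured poem body still contains residual HTML such as `<br />` tags, and gushiwen.cn may reject requests that lack a browser-like User-Agent. The sketch below is one way to harden the scraper under those assumptions; the `HEADERS` string and the `clean_html`/`fetch` helpers are illustrative additions, not part of the original answer.

```python
import re
import requests

# Assumed browser-like header; many sites return 403 to the default requests UA.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
}

def clean_html(raw: str) -> str:
    """Strip residual HTML tags (e.g. <br />) and collapse whitespace."""
    text = re.sub(r"<[^>]+>", "", raw)        # drop any remaining tags
    return re.sub(r"\s+", " ", text).strip()  # normalize whitespace

def fetch(url: str) -> str:
    """GET a page with the browser-like header and explicit UTF-8 decoding."""
    resp = requests.get(url, headers=HEADERS, timeout=10)
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error page
    resp.encoding = "utf-8"
    return resp.text
```

With these helpers, `fetch_poems` would call `fetch(url)` in place of `requests.get(url)` and wrap the extracted body in `clean_html(...)` before appending it to `poems`.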