
A hands-on guide: precisely scraping key information from Anjuke's Dalian second-hand housing pages!

from lxml import etree


def parser(html):
    doc = etree.HTML(html)  # convert the HTML string into an Element object
    out_list = []
    # each <li> under the listing <ul> is one property
    for item in doc.xpath("//ul[@class='list-items']/li"):
        # note: [0] raises IndexError if a listing lacks the field
        sell_point = item.xpath(".//div[@class='sell-point']/text()")[0].strip()
        building = item.xpath(".//div[@class='prop-title']/a/text()")[0].strip()
        address = item.xpath(".//div[@class='details-item']/span[1]/text()")[0].strip()
        house_type = item.xpath(".//div[@class='details-item']/span[2]/text()")[0].strip()
        area = item.xpath(".//div[@class='details-item']/span[3]/text()")[0].strip()
        floor = item.xpath(".//div[@class='details-item']/span[4]/text()")[0].strip()
        build_year = item.xpath(".//div[@class='details-item']/span[5]/text()")[0].strip()
        price = item.xpath(".//div[@class='pro-price']/strong/text()")[0].strip()
        url = item.xpath(".//div[@class='prop-title']/a/@href")[0].strip()
        out_list.append({
            'sell_point': sell_point,
            'building': building,
            'address': address,
            'house_type': house_type,
            'area': area,
            'floor': floor,
            'build_year': build_year,
            'price': price,
            'url': url
        })
    return out_list

###### [AI code-writing tool | 854-point answer | 2024-11-02 08:56:33]
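
The parser indexes every XPath result with `[0]`, so a single listing that lacks one field (for example, no build year) would abort the whole page with an `IndexError`. A minimal sketch of a more tolerant variant follows. It reuses the XPath expressions from the answer unchanged; the names `first_text`, `FIELDS`, `parse_item`, and `parse_listings` are hypothetical helpers introduced here for illustration, and the empty-string default is an assumption rather than anything the original answer specifies.

```python
# Hypothetical helpers: a tolerant variant of the parser in the answer.
# The XPath expressions are copied from the answer; whether they still
# match the live page is not verified here.

FIELDS = {
    "sell_point": ".//div[@class='sell-point']/text()",
    "building": ".//div[@class='prop-title']/a/text()",
    "address": ".//div[@class='details-item']/span[1]/text()",
    "house_type": ".//div[@class='details-item']/span[2]/text()",
    "area": ".//div[@class='details-item']/span[3]/text()",
    "floor": ".//div[@class='details-item']/span[4]/text()",
    "build_year": ".//div[@class='details-item']/span[5]/text()",
    "price": ".//div[@class='pro-price']/strong/text()",
    "url": ".//div[@class='prop-title']/a/@href",
}


def first_text(nodes, default=""):
    """Return the first XPath result, stripped, or `default` if none matched."""
    return nodes[0].strip() if nodes else default


def parse_item(item):
    """Build one listing dict from any element exposing an .xpath() method."""
    return {name: first_text(item.xpath(expr)) for name, expr in FIELDS.items()}


def parse_listings(html):
    """Parse a full listing page; missing fields become empty strings."""
    # Imported here so the helpers above stay usable without lxml installed.
    from lxml import etree

    doc = etree.HTML(html)
    return [parse_item(li) for li in doc.xpath("//ul[@class='list-items']/li")]
```

Calling `parse_listings(html)` returns the same list-of-dicts shape as `parser(html)`, except that an absent field yields `''` instead of raising. Keeping the expressions in one dictionary also means a change in the page layout only requires editing `FIELDS`.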
