
Step by step: a parsing function for precisely scraping key information from Anjuke's Dalian second-hand housing pages

from lxml import etree


def first_text(node, path):
    """Return the first stripped match for an XPath text query, or '' if nothing matches."""
    found = node.xpath(path)
    return found[0].strip() if found else ''


def parser(html):
    """Parse a listing page and return one dict per listing row."""
    doc = etree.HTML(html)
    out_list = []
    # Each <li> under the sale list is treated as one listing.
    for row in doc.xpath("//ul[@class='sale-list']//li"):
        out_list.append({
            'sellingpoint': first_text(row, ".//div[@class='pro-title']/text()"),
            'communityname': first_text(row, ".//div[@class='house-title']/a/text()"),
            'address': first_text(row, ".//address/text()"),
            # The four details-item blocks are read by position; the value is the second <span>.
            'housetype': first_text(row, ".//div[@class='details-item'][1]/span[2]/text()"),
            'buildingarea': first_text(row, ".//div[@class='details-item'][2]/span[2]/text()"),
            'floor': first_text(row, ".//div[@class='details-item'][3]/span[2]/text()"),
            'constructionyear': first_text(row, ".//div[@class='details-item'][4]/span[2]/text()"),
            'price': first_text(row, ".//span[@class='price-det']/text()"),
        })
    return out_list

###### [AI code-writing tool | 792-point answer | 2024-11-02 08:58:39]
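To make the markup that the XPath expressions above rely on more concrete, here is a minimal sketch of the same row-by-row extraction. It is not the answer's method: it swaps lxml for the standard library's `xml.etree.ElementTree` so it runs without extra packages, and it works on a hand-written, well-formed fragment. The fragment's structure is an assumption inferred from the selectors, not a copy of the real Anjuke page, and the names `SAMPLE`, `DETAIL_KEYS`, `text_of` and `parse_sample` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment. Its structure is inferred from the XPath
# expressions in parser() and is NOT copied from the real site.
SAMPLE = """
<ul class="sale-list">
  <li>
    <div class="pro-title">Bright south-facing flat</div>
    <div class="house-title"><a>Example Garden</a></div>
    <address>Example District, Example Road</address>
    <div class="details-item"><span>Layout</span><span>2 rooms</span></div>
    <div class="details-item"><span>Area</span><span>85 m2</span></div>
    <div class="details-item"><span>Floor</span><span>Mid (of 18)</span></div>
    <div class="details-item"><span>Built</span><span>2010</span></div>
    <span class="price-det">120</span>
  </li>
  <li>
    <div class="pro-title">Listing with missing fields</div>
  </li>
</ul>
"""

# Output keys for the four details-item blocks, in document order.
DETAIL_KEYS = ["housetype", "buildingarea", "floor", "constructionyear"]


def text_of(node, path):
    """Return the stripped text of the first match, or '' if there is none."""
    found = node.find(path)
    if found is None or found.text is None:
        return ""
    return found.text.strip()


def parse_sample(xml_text):
    """Extract one dict per <li>, using the same keys as parser()."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for li in root.findall(".//li"):
        row = {
            "sellingpoint": text_of(li, "div[@class='pro-title']"),
            "communityname": text_of(li, "div[@class='house-title']/a"),
            "address": text_of(li, "address"),
            "price": text_of(li, "span[@class='price-det']"),
        }
        # Index the details-item blocks in Python; the value is the second <span>.
        details = li.findall("div[@class='details-item']")
        for i, key in enumerate(DETAIL_KEYS):
            spans = details[i].findall("span") if i < len(details) else []
            value = spans[1].text if len(spans) > 1 else None
            row[key] = (value or "").strip()
        rows.append(row)
    return rows


listings = parse_sample(SAMPLE)
for item in listings:
    print(item)
```

In this sketch a missing field becomes an empty string, so an incomplete listing does not stop the loop. Real pages are rarely well-formed XML, which is one reason an HTML-tolerant parser such as lxml is used in the function above.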
