Using httpx



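Some websites only speak HTTP/2. urllib and requests support HTTP/1.1 only, so they cannot fetch data from such sites; in that case, use httpx instead. HTTP/2 support in httpx is optional and needs the extra dependency installed with pip install httpx[http2].
Official documentation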

Basic usage

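httpx is very similar to requests, but it also provides a Client class that lets you choose the protocol. It is therefore a good idea to instantiate a Client object first and use it for all subsequent requests.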

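import httpx

url = ''      # placeholder: fill in the target URL
headers = {}  # placeholder: fill in request headers as needed

client = httpx.Client(http2=True)  # HTTP/2 must be enabled explicitly; the default is HTTP/1.1
response = client.get(url, headers=headers)
print(response.text)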

The Client object

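The Client object is similar to a ClientSession object (the name aiohttp uses). You can think of it as the session that the crawler keeps alive: it holds connections open across requests and can be closed at any time, much like an open file. As with files, a with block closes it for you.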

Example code:

with httpx.Client(http2=True) as client:  # closed automatically when the block exits
    response = client.get(url, headers=headers)

Asynchronous requests

httpx also supports asynchronous requests through AsyncClient, which works with Python's async/await syntax. It is written like this:

import httpx
import asyncio

async def fetch(url):
    async with httpx.AsyncClient(http2=True) as client:
        response = await client.get(url)  # client.get is a coroutine here and must be awaited
        print(response.text)

url = ''  # placeholder: fill in the target URL
asyncio.run(fetch(url))

Author: Dovahkiin
Reprint policy: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: Dovahkiin.