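A site issue caused by Transfer-Encoding: chunked
HTTP GET requests
An HTTP GET request is the simplest request type. Typing a URL into the browser's address bar and pressing Enter sends an HTTP GET request. A simple example: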
GET /path/to/resource?query=string HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, sdch
Connection: keep-alive
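After the last header line, the client sends one more line containing only \r\n. This empty line marks the end of the header section, and because a GET request normally carries no body, it also marks the end of the request.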
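To make the terminating empty line visible, here is a minimal sketch that serializes a GET request like the one above into the bytes that go on the wire. The function name `build_get_request` and the reduced header set are illustrative assumptions, not anything a browser or library defines.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Serialize a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: */*",
        "Connection: keep-alive",
    ]
    # Every line ends with CRLF; the extra CRLF at the end is the empty
    # line that closes the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("www.example.com", "/path/to/resource?query=string")
print(repr(request))
```

The printed bytes end with `\r\n\r\n`: the CRLF of the last header followed by the empty line.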
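Introduction to Transfer-Encoding: chunked
Transfer-Encoding is an HTTP/1.1 header that specifies the encoding applied to the payload for transfer. Possible values are chunked, compress, deflate, and gzip, and several compatible values can be listed together. The header is hop-by-hop: it applies only to the connection between two adjacent nodes, not end to end. To compress a resource along the whole path from origin to client, use the Content-Encoding header instead.
An example of Transfer-Encoding: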
Transfer-Encoding: gzip, chunked
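Why use chunked
If the length of the payload is known up front, the Content-Length header can be used. chunked is for the case where the length is not known when transmission starts: the payload is sent piece by piece, and each piece is preceded by its own length in hexadecimal (so 11 in the example below means 17 bytes). A zero-length chunk ends the body.
Example: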
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

7\r\n
Mozilla\r\n
11\r\n
Developer Network\r\n
0\r\n
\r\n
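The framing in the example above can be reproduced with a short sketch. The helper names `encode_chunked` and `decode_chunked` are illustrative, and the decoder assumes a complete body with no chunk extensions or trailer fields.

```python
def encode_chunked(pieces):
    """Frame each piece as <hex length>CRLF<data>CRLF, then add the final zero-length chunk."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # an empty piece would look like the terminating chunk
        out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk (size 0) followed by the closing empty line
    return bytes(out)


def decode_chunked(body):
    """Decode a complete chunked body (no chunk extensions or trailers handled)."""
    data = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)  # chunk sizes are hexadecimal
        if size == 0:
            return bytes(data)
        start = line_end + 2
        data += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


wire = encode_chunked([b"Mozilla", b"Developer Network"])
print(wire)                  # b'7\r\nMozilla\r\n11\r\nDeveloper Network\r\n0\r\n\r\n'
print(decode_chunked(wire))  # b'MozillaDeveloper Network'
```

A receiver cannot know the body is finished until it has read the `0\r\n\r\n` terminator, which is why a missing terminator leaves it waiting.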
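When HTTP GET meets Transfer-Encoding: chunked
As described above, an HTTP GET request normally carries no payload. So what happens if a client mistakenly sends Transfer-Encoding: chunked in the headers of a GET request?
Different servers may handle this differently: some respond right away, while others wait to receive the payload.
How Tomcat handles it
Based on the author's local debugging, as of 2024-07-13 the latest Tomcat version still waits to receive the chunked payload until the socket read times out.
This is the stack of one Tomcat version while it waits to read the payload: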
java.lang.Object.wait(Native Method)
org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.fillReadBuffer(NioEndpoint.java:1333)
org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.read(NioEndpoint.java:1234)
org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:785)
org.apache.coyote.http11.Http11InputBuffer.access$400(Http11InputBuffer.java:41)
org.apache.coyote.http11.Http11InputBuffer$SocketInputBuffer.doRead(Http11InputBuffer.java:1185)
org.apache.coyote.http11.filters.ChunkedInputFilter.readBytes(ChunkedInputFilter.java:310)
org.apache.coyote.http11.filters.ChunkedInputFilter.parseChunkHeader(ChunkedInputFilter.java:338)
org.apache.coyote.http11.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:164)
org.apache.coyote.http11.filters.ChunkedInputFilter.end(ChunkedInputFilter.java:229)
org.apache.coyote.http11.Http11InputBuffer.endRequest(Http11InputBuffer.java:644)
org.apache.coyote.http11.Http11Processor.endRequest(Http11Processor.java:1184)
org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:430)
org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:926)
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1791)
org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
java.lang.Thread.run(Thread.java:750)
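Tomcat first reads the header lines one by one until it reaches the empty \r\n line, and then sets up the required list of InputFilter instances based on the headers (it is a list, but it may hold only one filter). Common InputFilter implementations include:
- VoidInputFilter - used when the request carries no body, which is the usual case for GET and HEAD.
- ChunkedInputFilter - used when the request is chunked.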
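The selection step can be pictured with a simplified Python model. This is not Tomcat's source code: the function `select_input_filter` is hypothetical, and the Content-Length branch (returning the name of Tomcat's `IdentityInputFilter`) is added on the assumption that a length-delimited body gets its own filter. The point it illustrates, consistent with the stack traces in this post, is that the choice follows the headers rather than the request method.

```python
def select_input_filter(headers):
    """Pick a body filter name from request headers (hypothetical, simplified model)."""
    normalized = {name.lower(): value for name, value in headers.items()}
    codings = [
        coding.strip().lower()
        for coding in normalized.get("transfer-encoding", "").split(",")
        if coding.strip()
    ]
    if "chunked" in codings:
        # Chunked wins regardless of the method, so a GET lands here too.
        return "ChunkedInputFilter"
    if "content-length" in normalized:
        return "IdentityInputFilter"
    # No body delimitation at all: treat the request as having no body.
    return "VoidInputFilter"


print(select_input_filter({"Host": "www.example.com"}))
print(select_input_filter({"Host": "www.example.com", "Transfer-Encoding": "chunked"}))
```

Under this model a plain GET gets `VoidInputFilter`, while the same GET with `Transfer-Encoding: chunked` gets `ChunkedInputFilter`, which then tries to read a chunk header from the socket.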
An example request
Below is an example, written in Python, that sends a GET request with Transfer-Encoding: chunked set:
import socket
from concurrent.futures import ThreadPoolExecutor

# Configuration
host = 'www.tianxiaohui.com'
port = 80
buffer_size = 4096
# Read timeout in seconds (socket.settimeout takes seconds). It is deliberately
# very large so that the server's own read timeout fires first.
read_timeout = 100000


def call():
    # Create a socket object using IPv4 and TCP protocols
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the read timeout on the socket
    client_socket.settimeout(read_timeout)
    try:
        # Connect to the server
        client_socket.connect((host, port))
        # Prepare the HTTP request: headers only, no chunked body and no terminating chunk
        http_request = ("GET /sell/marketing/v1/ad_campaign?limit=100&offset=0 HTTP/1.1\r\n"
                        f"Host: {host}\r\n"
                        "accept: application/json, text/json, text/x-json, text/javascript\r\n"
                        "accept-encoding: application/gzip, deflate\r\n"
                        "Transfer-Encoding: chunked\r\n"
                        "\r\n")
        # Send the HTTP request to the server
        client_socket.sendall(http_request.encode())
        # Receive the response as bytes until the server closes the connection
        parts = []
        while True:
            part = client_socket.recv(buffer_size)
            if not part:
                break
            parts.append(part)
        # Decode once at the end so a multi-byte character split across reads cannot fail
        response = b''.join(parts).decode(errors='replace')
    except socket.timeout:
        print("Read timed out")
        response = None
    finally:
        # Close the socket
        client_socket.close()
    # Return the response
    return response


# Number of parallel calls
num_calls = 1

# Use ThreadPoolExecutor to execute the calls in parallel
with ThreadPoolExecutor(max_workers=num_calls) as executor:
    # Submit all calls to the executor
    future_calls = [executor.submit(call) for _ in range(num_calls)]
    # Wait for all futures to complete and print their results
    for future in future_calls:
        response = future.result()
        if response is not None:
            print("Response:")
            print(response)
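If the call is pointed at a local Tomcat instead, you can see the request hang for about 20 seconds (20000 ms is the connectionTimeout in the server.xml that ships with Tomcat). During that time Tomcat has finished reading the headers and is waiting for a chunked payload that never arrives, so the request only ends when the read times out.
This is the stack captured on the latest Tomcat 10.1.25: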
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base@17.0.4.1/Native Method)
- waiting on <0x000000061a3aea90> (a java.util.concurrent.Semaphore)
at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.fillReadBuffer(NioEndpoint.java:1280)
- locked <0x000000061a3aea90> (a java.util.concurrent.Semaphore)
at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.read(NioEndpoint.java:1181)
at org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:789)
at org.apache.coyote.http11.Http11InputBuffer$SocketInputBuffer.doRead(Http11InputBuffer.java:1195)
at org.apache.coyote.http11.filters.ChunkedInputFilter.readBytes(ChunkedInputFilter.java:254)
at org.apache.coyote.http11.filters.ChunkedInputFilter.fill(ChunkedInputFilter.java:295)
at org.apache.coyote.http11.filters.ChunkedInputFilter.parseChunkHeader(ChunkedInputFilter.java:328)
at org.apache.coyote.http11.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:136)
at org.apache.coyote.http11.filters.ChunkedInputFilter.end(ChunkedInputFilter.java:181)
at org.apache.coyote.http11.Http11InputBuffer.endRequest(Http11InputBuffer.java:646)
at org.apache.coyote.http11.Http11Processor.endRequest(Http11Processor.java:1188)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:429)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:904)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)
at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
at java.lang.Thread.run(java.base@17.0.4.1/Thread.java:833)
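One way to confirm that the thread is waiting for the chunked body is to compare two requests: one that stops after the headers and one that also sends the terminating zero-length chunk. The sketch below only builds and times the requests. The names `build_chunked_get` and `time_request`, and the `localhost:8080` address in the usage comment, are assumptions for a local test setup, and the expectation that the terminated request returns promptly follows from the chunked framing rules rather than from a measurement in this post.

```python
import socket
import time


def build_chunked_get(host, path, terminate):
    """Build a GET request with Transfer-Encoding: chunked (illustrative helper).

    With terminate=True the empty chunked body is closed by the "0" chunk,
    so the server has nothing left to wait for.
    """
    head = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + (b"0\r\n\r\n" if terminate else b"")


def time_request(host, port, request, timeout=60.0):
    """Send raw request bytes and return seconds elapsed until the server closes the connection."""
    started = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        while conn.recv(4096):
            pass  # drain the response; recv returns b"" once the server closes
    return time.monotonic() - started


# Usage against a local server (address is an assumption):
# print(time_request("localhost", 8080, build_chunked_get("localhost", "/", terminate=False)))
# print(time_request("localhost", 8080, build_chunked_get("localhost", "/", terminate=True)))
```

If the first call takes roughly the server's read timeout and the second returns quickly, the delay comes from the missing chunk terminator rather than from the application.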