探索 Java URL 连接的 read timeout 如何实现的
使用最朴实的 java.net.HttpURLConnection
来获取网页数据, 设置 timeout
然后通过OS层面的 tracing 工具去获取代码栈. 运行于 Ubuntu 22.04 和 OpenJDK 11. 由于JDK 11 使用了系统调用 poll
去获取网络事件, 然后使用了系统调用 recvfrom
来取的数据, 所以会拦截这2个系统调用. 另外可以看到下面使用perf 注入符号表, perf 和 bpftrace 都能看到 Java 编译的方法. 但是 strace 和 gdb 不行.
依次使用的工具:
- strace
- perf
- bpftrace
- gdb
java 代码
还是使用之前 chatGPT 给的这段代码. 注意这里设置的 connect timeout
和 read timeout
参数, 后面会看到.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
public class URLTest {
public static void main(String[] args) throws Exception {
for (int i = 0; i < 1000; i++) {
try {
URL url = new URL("http://www.tianxiaohui.com");
HttpURLConnection con = (HttpURLConnection) url.openConnection();
con.setRequestMethod("GET");
con.setConnectTimeout(997); // 连接超时时间 997ms
con.setReadTimeout(719); // 读取超时时间 719ms
System.out.println(i + "Response code: " + con.getResponseCode());
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
String line;
StringBuilder response = new StringBuilder();
while ((line = in.readLine()) != null) {
response.append(line);
}
in.close();
//System.out.println(response.toString());
} catch (Exception e) {
e.printStackTrace();
System.out.println(e.getMessage());
}
Thread.sleep(2000);
}
}
}
准备工作
运行程序使用以下代码:
$ javac URLTest.java
$ java -XX:+PreserveFramePointer -XX:-TieredCompilation -XX:CompileThreshold=1 URLTest
生成JIT 编译的代码符号表. 参看: bpftrace 探测 Java 运行时栈-实践
strace
直接对这个java程序使用 trace poll
和 recvfrom
event. 可以看到 poll
系统调用的timeout参数是997和719. 分别来自于JDK里面的C 代码 Java_java_net_PlainSocketImpl_socketConnect
和 Java_java_net_SocketInputStream_socketRead0
$ sudo strace --stack-trace -f -e trace=poll -p 1097556
strace: Process 1097556 attached with 18 threads
[pid 1097557] poll([{fd=5, events=POLLOUT}], 1, 997 <unfinished ...>
[pid 1097670] +++ exited with 0 +++
[pid 1097557] <... poll resumed>) = 1 ([{fd=5, revents=POLLOUT}])
> /usr/lib/x86_64-linux-gnu/libc.so.6(__poll+0x4f) [0x118dbf]
> /usr/lib/jvm/java-11-openjdk-amd64/lib/libnet.so(NET_Poll+0xab) [0xff2b]
> /usr/lib/jvm/java-11-openjdk-amd64/lib/libnet.so(Java_java_net_PlainSocketImpl_socketConnect+0x1ec) [0xd24c]
> unexpected_backtracing_error [0x7f93742cbd95]
[pid 1097557] poll([{fd=5, events=POLLIN|POLLERR}], 1, 719) = 1 ([{fd=5, revents=POLLIN}])
> /usr/lib/x86_64-linux-gnu/libc.so.6(__poll+0x4f) [0x118dbf]
> /usr/lib/jvm/java-11-openjdk-amd64/lib/libnet.so(NET_Timeout+0xed) [0x1019d]
> /usr/lib/jvm/java-11-openjdk-amd64/lib/libnet.so(Java_java_net_SocketInputStream_socketRead0+0xdf) [0xe75f]
> unexpected_backtracing_error [0x7f93742daa34]
$ sudo strace --stack-trace -f -e trace=recvfrom -p 1097556
strace: Process 1097556 attached with 18 threads
[pid 1097557] recvfrom(5, "HTTP/1.1 301 Moved Permanently\r\n"..., 8192, MSG_DONTWAIT, NULL, NULL) = 242
> /usr/lib/x86_64-linux-gnu/libc.so.6(recv+0x6e) [0x1278ae]
> /usr/lib/jvm/java-11-openjdk-amd64/lib/libnet.so(NET_NonBlockingRead+0xb0) [0xf1e0]
> /usr/lib/jvm/java-11-openjdk-amd64/lib/libnet.so(Java_java_net_SocketInputStream_socketRead0+0xf9) [0xe779]
> unexpected_backtracing_error [0x7f93742daa34]
perf
使用 perf 要先记录数据, 然后生成命令行report, 进入单个结果, 就能看到完整的栈. 这个栈是从上往下的, 最新的在下面.
$ sudo perf record -p 1097556 -e "syscalls:sys_enter_poll" -ag
$ ls -lah perf.data
$ sudo perf report
# 选择一个sample, enter 键进入detail 如下
Samples: 8 of event 'syscalls:sys_enter_poll', Event count (approx.): 8
Children Self Trace output
- 37.50% 37.50% ufds: 0x7f937753c438, nfds: 0x00000001, timeout_msecs: 0x000002cf ▒
start_thread ▒
ThreadJavaMain ▒
JavaMain ▒
jni_CallStaticVoidMethod ▒
jni_invoke_static ▒
JavaCalls::call_helper ▒
call_stub ▒
Interpreter ▒
Ljava/net/HttpURLConnection;::getResponseCode ▒
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream ▒
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream0 ▒
Lsun/net/www/http/HttpClient;::parseHTTP ▒
Lsun/net/www/http/HttpClient;::parseHTTPHeader ▒
Ljava/io/BufferedInputStream;::read ▒
Ljava/io/BufferedInputStream;::read1 ▒
Ljava/io/BufferedInputStream;::fill ▒
Ljava/net/SocketInputStream;::read ▒
Ljava/net/SocketInputStream;::socketRead0 ▒
Java_java_net_SocketInputStream_socketRead0 ▒
__poll ▒
+ 25.00% 25.00% ufds: 0x7f937754c378, nfds: 0x00000001, timeout_msecs: 0x000003e5 ▒
+ 12.50% 12.50% ufds: 0x7f937754a888, nfds: 0x00000001, timeout_msecs: 0x00000000 ▒
+ 12.50% 12.50% ufds: 0x7f937754a888, nfds: 0x00000001, timeout_msecs: 0x00001385 ▒
+ 12.50% 12.50% ufds: 0x7f937754a888, nfds: 0x00000001, timeout_msecs: 0x00001388
$ sudo perf record -p 1097556 -e "syscalls:sys_enter_recvfrom" -ag
$ sudo perf report
Samples: 2 of event 'syscalls:sys_enter_recvfrom', Event count (approx.): 2
Children Self Trace output
- 100.00% 100.00% fd: 0x00000005, ubuf: 0x7f937753c4c0, size: 0x00002000, flags: 0x00000040, addr: 0x00000000, addr_len: 0x00000000 ◆
start_thread ▒
ThreadJavaMain ▒
JavaMain ▒
jni_CallStaticVoidMethod ▒
jni_invoke_static ▒
JavaCalls::call_helper ▒
call_stub ▒
Interpreter ▒
Ljava/net/HttpURLConnection;::getResponseCode ▒
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream ▒
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream0 ▒
Lsun/net/www/http/HttpClient;::parseHTTP ▒
Lsun/net/www/http/HttpClient;::parseHTTPHeader ▒
Ljava/io/BufferedInputStream;::read ▒
Ljava/io/BufferedInputStream;::read1 ▒
Ljava/io/BufferedInputStream;::fill ▒
Ljava/net/SocketInputStream;::read ▒
Ljava/net/SocketInputStream;::socketRead0 ▒
Java_java_net_SocketInputStream_socketRead0 ▒
__libc_recv
bpftrace
使用 bpftrace 给出的 event 获取用户态的栈.
$ sudo bpftrace -e 'tracepoint:syscalls:sys_enter_poll /pid==1097556/{ @[ustack(20)] = count(); }'
Attaching 1 probe...
^C
@[
poll+79
Java_java_net_SocketInputStream_socketRead0+223
Ljava/net/SocketInputStream;::socketRead0+244
Ljava/net/SocketInputStream;::read+224
Ljava/io/BufferedInputStream;::fill+784
Ljava/io/BufferedInputStream;::read1+176
Ljava/io/BufferedInputStream;::read+252
Lsun/net/www/http/HttpClient;::parseHTTPHeader+444
Lsun/net/www/http/HttpClient;::parseHTTP+1004
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream0+1420
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream+196
Ljava/net/HttpURLConnection;::getResponseCode+96
Interpreter+4352
call_stub+138
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+883
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .constprop.1]+682
jni_CallStaticVoidMethod+352
JavaMain+3441
ThreadJavaMain+13
start_thread+755
]: 1
$ sudo bpftrace -e 'tracepoint:syscalls:sys_enter_recvfrom /pid==1097556/{ @[ustack(20)] = count(); }'
Attaching 1 probe...
^C
@[
recvfrom+116
]: 2
@[
__GI___recv+110
Java_java_net_SocketInputStream_socketRead0+249
Ljava/net/SocketInputStream;::socketRead0+244
Ljava/net/SocketInputStream;::read+224
Ljava/io/BufferedInputStream;::fill+784
Ljava/io/BufferedInputStream;::read1+176
Ljava/io/BufferedInputStream;::read+252
Lsun/net/www/http/HttpClient;::parseHTTPHeader+444
Lsun/net/www/http/HttpClient;::parseHTTP+1004
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream0+1420
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream+196
Ljava/net/HttpURLConnection;::getResponseCode+96
Interpreter+4352
call_stub+138
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+883
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .constprop.1]+682
jni_CallStaticVoidMethod+352
JavaMain+3441
ThreadJavaMain+13
start_thread+755
]: 4
gdb
使用gdb连接, 然后通过设置段点来找到我们期望的栈. 这个看到的更多一些, 包括从 Java_java_net_SocketInputStream_socketRead0
到 glibc 的 __GI___poll
中间的2步: NET_ReadWithTimeout
和NET_Timeout
.
# gdb 连接
$ sudo gdb --pid=1097556
# 找到我们需要的线程
$ thread 2
#设置断点, 先设置一个 socketRead的, 如果直接设置
$ break Java_java_net_SocketInputStream_socketRead0
$ cont # 到上面这个断点停止
$ bt # 查看是不是我们想要的栈
$ break __GI___poll
$ cont # 这样就会到我们想要的点
(gdb) break __GI___poll
Breakpoint 5 at 0x7f9378c6dd70: file ../sysdeps/unix/sysv/linux/poll.c, line 27.
(gdb) cont
Continuing.
Thread 2 "java" hit Breakpoint 5, __GI___poll (fds=fds@entry=0x7f937753c438, nfds=nfds@entry=1, timeout=719, timeout@entry=<error reading variable: That operation is not available on integers of more than 8 bytes.>) at ../sysdeps/unix/sysv/linux/poll.c:27
27 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) bt
#0 __GI___poll (fds=fds@entry=0x7f937753c438, nfds=nfds@entry=1, timeout=719,
timeout@entry=<error reading variable: That operation is not available on integers of more than 8 bytes.>) at ../sysdeps/unix/sysv/linux/poll.c:27
#1 0x00007f937402419d in poll (__timeout=719, __nfds=1, __fds=0x7f937753c438) at /usr/include/x86_64-linux-gnu/bits/poll2.h:39
#2 NET_Timeout (env=env@entry=0x7f937001bb48, s=s@entry=5, timeout=<error reading variable: That operation is not available on integers of more than 8 bytes.>,
nanoTimeStamp=nanoTimeStamp@entry=1643674031468882) at ./src/java.base/linux/native/libnet/linux_close.c:433
#3 0x00007f937402275f in NET_ReadWithTimeout (timeout=<optimized out>, len=8192, bufP=0x7f937753c4c0 "", fd=5, env=0x7f937001bb48)
at ./src/java.base/unix/native/libnet/SocketInputStream.c:55
#4 Java_java_net_SocketInputStream_socketRead0 (env=0x7f937001bb48, this=<optimized out>, fdObj=<optimized out>, data=0x7f937754c518, off=0, len=8192, timeout=719)
at ./src/java.base/unix/native/libnet/SocketInputStream.c:127
#5 0x00007f93742daa34 in ?? ()
#6 0x00000000000002cf in ?? ()
timeout 的实现
根据上面的栈信息, 我们很容易就知道了我们期望的 timeout 如何使用的:
- Java_java_net_SocketInputStream_socketRead0 - https://github.com/openjdk/jdk11/blob/37115c8ea4aff13a8148ee2b8832b20888a5d880/src/java.base/unix/native/libnet/SocketInputStream.c#L91C1-L91C44
- NET_ReadWithTimeout - https://github.com/openjdk/jdk11/blob/37115c8ea4aff13a8148ee2b8832b20888a5d880/src/java.base/unix/native/libnet/SocketInputStream.c#L50
- NET_Timeout - https://github.com/openjdk/jdk11/blob/master/src/java.base/linux/native/libnet/linux_close.c#L415
- poll - https://github.com/openjdk/jdk11/blob/master/src/java.base/linux/native/libnet/linux_close.c#L441
其中真正使用 timeout 地方在于 poll
和 Java_java_net_SocketInputStream_socketRead0 里面的循环.
如果改成 https
改成 https 连接 connect 的栈:
poll+79
Java_java_net_PlainSocketImpl_socketConnect+492
Ljava/net/PlainSocketImpl;::socketConnect+213
Ljava/net/AbstractPlainSocketImpl;::doConnect+328
Ljava/net/AbstractPlainSocketImpl;::connect+444
Ljava/net/SocksSocketImpl;::connect+2528
Ljava/net/Socket;::connect+592
Lsun/security/ssl/SSLSocketImpl;::connect+96
Lsun/net/NetworkClient;::doConnect+432
Lsun/net/www/http/HttpClient;::openServer+72
Lsun/net/www/http/HttpClient;::openServer+348
Lsun/net/www/protocol/https/HttpsClient;::<init>+972
Lsun/net/www/protocol/https/HttpsClient;::New+1444
Lsun/net/www/protocol/http/HttpURLConnection;::plainConnect0+3272
Lsun/net/www/protocol/http/HttpURLConnection;::plainConnect+256
Lsun/net/www/protocol/https/AbstractDelegateHttpsURLConnection;::connect+100
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream0+1996
Lsun/net/www/protocol/http/HttpURLConnection;::getInputStream+192
Ljava/net/HttpURLConnection;::getResponseCode+116
LURLTest;::main+660
改成 https 连接 read 的栈:
poll+79
Java_java_net_SocketInputStream_socketRead0+223
Ljava/net/SocketInputStream;::socketRead0+244
Ljava/net/SocketInputStream;::read+204
Lsun/security/ssl/SSLSocketInputRecord;::read+52
Lsun/security/ssl/SSLSocketInputRecord;::readHeader+144
Lsun/security/ssl/SSLSocketInputRecord;::bytesInCompletePacket+72
Lsun/security/ssl/SSLSocketImpl;::readApplicationRecord+348
Lsun/security/ssl/SSLSocketImpl$AppInputStream;::read+552
Ljava/io/BufferedInputStream;::fill+760
Ljava/io/BufferedInputStream;::read+676
Lsun/net/www/http/ChunkedInputStream;::fastRead+116
Lsun/net/www/http/ChunkedInputStream;::read+584
Lsun/net/www/protocol/http/HttpURLConnection$HttpInputStream;::read+116
Lsun/nio/cs/StreamDecoder;::readBytes+444
Lsun/nio/cs/StreamDecoder;::implRead+392
Lsun/nio/cs/StreamDecoder;::read+432
Ljava/io/BufferedReader;::fill+724
Ljava/io/BufferedReader;::readLine+1132
LURLTest;::main+304