This article details the integration of Arthas into Apache DolphinScheduler to enable real-time monitoring of API calls. Arthas, a powerful Java diagnostic tool, assists developers in inspecting the runtime status, identifying performance bottlenecks, and tracking method calls. Embedding Arthas in DolphinScheduler allows for the capture of key call information during task scheduling, enabling timely issue detection and resolution for improved system stability. Here, we outline the steps to start Arthas within the DolphinScheduler environment, monitor specific API calls, and analyze the collected performance data to enhance scheduling reliability and maintainability.
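Download the Arthas package (arthas-packaging-3.7.2-bin.zip in this walkthrough) from:

https://arthas.aliyun.com/download/latest_version?mirror=aliyun

Copy the archive to /opt/arthas, unpack it, and start Arthas:

    cp arthas-packaging-3.7.2-bin.zip /opt/arthas
    cd /opt/arthas
    unzip arthas-packaging-3.7.2-bin.zip
    java -jar arthas-boot.jar

When prompted, select the corresponding process ID.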
Startup may fail with the following error:

    [ERROR] Start arthas failed, exception stack trace:
    com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
        at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:102)
        at com.taobao.arthas.core.Arthas.<init>(Arthas.java:27)
        at com.taobao.arthas.core.Arthas.main(Arthas.java:161)
Solution:
In ${DOLPHINSCHEDULER_HOME}/api-server/bin, add the following line to jvm_args_env.sh, then restart the API server so the option takes effect:
-XX:+StartAttachListener
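The edit can also be scripted so that repeated runs do not add the flag twice. The following is a minimal sketch, assuming jvm_args_env.sh holds one JVM option per line; the helper name ensure_jvm_flag is hypothetical and not part of DolphinScheduler.

```shell
#!/bin/sh
# Hypothetical helper: append a JVM option to an args file only if it is
# not already present as a whole line. Assumes one option per line.
ensure_jvm_flag() {
    file=$1
    flag=$2
    # -x: whole-line match, -F: fixed string, -e: pattern may start with "-"
    if ! grep -qxF -e "$flag" "$file"; then
        printf '%s\n' "$flag" >> "$file"
    fi
}

# Example call, run from the DolphinScheduler home directory:
# ensure_jvm_flag api-server/bin/jvm_args_env.sh '-XX:+StartAttachListener'
```

Because the check matches the whole line, a commented-out copy of the flag does not count as present.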
Attaching can also fail with a permissions error:

    Picked up JAVA_TOOL_OPTIONS:
    java.io.IOException: well-known file /tmp/.java_pid731688 is not secure: file should be owned by the current user (which is 0) but is owned by 989
        at sun.tools.attach.LinuxVirtualMachine.checkPermissions(Native Method)
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:117)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
        at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:102)
        at com.taobao.arthas.core.Arthas.<init>(Arthas.java:27)
        at com.taobao.arthas.core.Arthas.main(Arthas.java:161)
    [ERROR] Start arthas failed, exception stack trace:
    [ERROR] attach fail, targetPid: 731688
Solution:
Run Arthas as the same OS user that runs DolphinScheduler. In the error above, Arthas was started by UID 0 (root), while the target process is owned by UID 989.
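Before attaching, it can help to confirm that the current user owns the target process. The sketch below compares numeric UIDs, the same values the error message reports. The function names proc_uid and same_owner are hypothetical, and the /proc lookup assumes Linux, with ps as a fallback elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric UID that owns a process.
proc_uid() {
    pid=$1
    if [ -d "/proc/$pid" ]; then
        # Linux: the /proc entry is normally owned by the process's effective UID
        stat -c %u "/proc/$pid"
    else
        ps -o uid= -p "$pid" | tr -d ' '
    fi
}

# Succeed only if the current user owns the target process.
same_owner() {
    [ "$(proc_uid "$1")" = "$(id -u)" ]
}

# Example (PID taken from the error message above):
# same_owner 731688 || echo "switch to the process owner before starting Arthas"
```

If the check fails, one option is to start Arthas through sudo -u with the name of the owning user, provided sudo is available on the host.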
Watch is used to monitor the specific execution details of methods, such as parameters and return values.
watch org.apache.dolphinscheduler.api.controller.UsersController queryUserList returnObj
    [arthas@731688]$ watch org.apache.dolphinscheduler.api.controller.UsersController queryUserList returnObj
    Press Q or Ctrl+C to abort.
    Affect(class count: 1 , method count: 1) cost in 126 ms, listenerId: 2
    method=org.apache.dolphinscheduler.api.controller.UsersController.queryUserList location=AtExit
    ts=2024-08-27 02:04:01; [cost=4.918943ms] result=@Result[
    ...
Trace monitors the depth of method calls, including the methods called and the execution time of each.
    [arthas@973263]$ trace org.apache.dolphinscheduler.api.controller.UsersController queryUserList
    Press Q or Ctrl+C to abort.
    Affect(class count: 1 , method count: 1) cost in 319 ms, listenerId: 1
    `---ts=2024-08-27 10:33:08;thread_name=qtp1836984213-26;id=26;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@439f5b3d
        `---[13.962731ms] org.apache.dolphinscheduler.api.controller.UsersController:queryUserList()
            +---[0.18% 0.025123ms ] org.apache.dolphinscheduler.api.controller.UsersController:checkPageParams() #130
            +---[0.09% 0.012549ms ] org.apache.dolphinscheduler.plugin.task.api.utils.ParameterUtils:handleEscapes() #131
            `---[96.47% 13.469876ms ] org.apache.dolphinscheduler.api.service.UsersService:queryUserList() #132
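In this trace, nearly all of the time is spent in UsersService:queryUserList(). For longer traces, the dominant call can be picked out of saved output with a small filter. This is a minimal sketch that assumes the output has been saved as plain text in the format shown above; slowest_call is a hypothetical helper, not an Arthas command.

```shell
#!/bin/sh
# Hypothetical helper: print the child call with the highest percentage
# from saved trace output (read from files or stdin).
slowest_call() {
    awk '
        # Child entries look like: +---[0.18% 0.025123ms ] Class:method() #130
        match($0, /\[[0-9.]+%/) {
            pct = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (pct > max) {
                max = pct
                best = substr($0, RSTART)   # drop the tree-drawing prefix
            }
        }
        END { if (best != "") print best }
    ' "$@"
}

# Example: slowest_call trace.txt
```

The root entry carries no percentage, so only child calls are compared.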
To generate a heap dump file, use:
    [arthas@973263]$ heapdump arthas-output/dump.hprof
    Dumping heap to arthas-output/dump.hprof ...
    Heap dump file created
Analyze the dump file with a tool such as Eclipse Memory Analyzer (MAT) to diagnose memory leaks.
Use memory to inspect JVM memory usage:
    [arthas@973263]$ memory
    Memory           used    total   max     usage
    heap             485M    900M    900M    53.91%
    ps_eden_space    277M    327M    358M    77.61%
    ...
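When the table is long, regions close to their limit can be filtered out of saved output. The sketch below assumes the memory output has been copied into a text file with the usage percentage in the last column; high_usage is a hypothetical helper, not an Arthas command.

```shell
#!/bin/sh
# Hypothetical helper: list memory regions whose usage is at or above a
# threshold (in percent), reading saved memory output from files or stdin.
high_usage() {
    threshold=$1
    shift
    awk -v t="$threshold" '
        # Only rows whose last column is a percentage are considered,
        # which skips the header row.
        $NF ~ /^[0-9.]+%$/ {
            pct = $NF
            sub(/%$/, "", pct)
            if (pct + 0 >= t + 0) print $1, $NF
        }
    ' "$@"
}

# Example: high_usage 70 memory.txt
```

With the two rows shown above and a threshold of 70, only ps_eden_space would be reported.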
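Use dashboard to view CPU usage per thread. To inspect further, thread -n 3 prints the stacks of the three busiest threads, and thread <id> prints the stack of a specific thread, using the ID shown in the dashboard.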