The JVM attach process and its underlying implementation

Published: 2023-08-02 21:51:44, Author: Geraltz'Rivia

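One of the core techniques behind RASP is the Java agent. An agent receives an implementation of the Instrumentation interface, and through this inst object it can modify bytecode.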

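A Java agent can be loaded when the JVM starts, using the -javaagent option. It can also be loaded at runtime by attaching to the target process and naming the jar to load; execution then enters the agentmain method defined in that jar and runs its logic.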
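As a rough illustration of the two entry points, the sketch below shows what a minimal agent class might look like. DemoAgent, LoggingTransformer and the target class name com/example/Target are hypothetical names chosen for this example; premain, agentmain, Instrumentation and ClassFileTransformer are the JDK's own. The agent jar's manifest would also need Premain-Class and/or Agent-Class entries pointing at the class.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class used only for illustration.
class DemoAgent {

    // Reports the one class it is interested in and returns null,
    // which tells the JVM to keep the original bytecode unchanged.
    static class LoggingTransformer implements ClassFileTransformer {
        private final String targetInternalName; // e.g. "com/example/Target"

        LoggingTransformer(String targetInternalName) {
            this.targetInternalName = targetInternalName;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (targetInternalName.equals(className)) {
                System.out.println("would instrument " + className);
            }
            return null; // null means "no change"
        }
    }

    // Entry point when the agent is loaded at startup with -javaagent.
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer("com/example/Target"));
    }

    // Entry point when the agent is loaded into a running JVM via attach.
    public static void agentmain(String args, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer("com/example/Target"));
    }
}
```

A real RASP agent would return rewritten bytes from transform instead of null; that part is left out here on purpose.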

All source code analysed below comes from OpenJDK 8; other versions may be implemented differently.


The attaching side

The attach API shipped with the JDK is used below to walk through the whole attach process.

The following code attaches a jar to a given JVM process:

        // Requires tools.jar (JDK 8) or the jdk.attach module, plus these imports:
        // import com.sun.tools.attach.VirtualMachine;
        // import com.sun.tools.attach.VirtualMachineDescriptor;
        // import java.io.File;
        // import java.util.Optional;
        String agentFilePath = "/Desktop/MyFirstAgent/target/MyFirstAgent-1.0-SNAPSHOT-jar-with-dependencies.jar";
        String applicationName = "MyApplication";

        // Iterate over all JVMs and take the first one whose display name matches our application name.
        // map() keeps the Optional empty when nothing matches, instead of throwing from get().
        Optional<String> jvmProcessOpt = VirtualMachine.list()
                .stream()
                .filter(jvm -> {
                    System.out.println("jvm: " + jvm.displayName());
                    return jvm.displayName().contains(applicationName);
                })
                .findFirst()
                .map(VirtualMachineDescriptor::id);

        if(!jvmProcessOpt.isPresent()) {
            System.err.println("Target Application not found");
            return;
        }
        File agentFile = new File(agentFilePath);
        try {
            String jvmPid = jvmProcessOpt.get();
            System.out.println("Attaching to target JVM with PID: " + jvmPid);
            VirtualMachine jvm = VirtualMachine.attach(jvmPid);
            jvm.loadAgent(agentFile.getAbsolutePath());
            jvm.detach();
            System.out.println("Attached to target JVM and loaded Java agent successfully");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }

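Among the running Java processes, the code picks the one whose JVM display name contains the name we specified, attaches to it, and loads the given agent into that JVM.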

VirtualMachine.attach(jvmPid)

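Attaching is carried out by an AttachProvider, and which provider is used depends on the platform. Each one is a concrete subclass of AttachProvider whose attachVirtualMachine method is called with the target pid. On macOS the work is done by BsdAttachProvider.

That method executes new BsdVirtualMachine. Stepping into it reveals the attach logic: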

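  1. First check that the two arguments are not null, and convert the string pid to an integer.
  2. Call findSocketFile to look in the tmp directory for a .java_pid<pid> file. If it does not exist, call createAttachFile, which creates a .attach_pid<pid> file in the target process's working directory ("/proc/" + pid + "/cwd/" + fn) or in the temporary directory.
  3. If the .java_pid<pid> file was missing, then once the .attach_pid<pid> file has been created, call the native method sendQuitTo to send SIGQUIT to the target. The Linux version is in jdk/src/solaris/native/sun/tools/attach/LinuxVirtualMachine.c: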
/*
 * Class:     sun_tools_attach_LinuxVirtualMachine
 * Method:    sendQuitTo
 * Signature: (I)V
 */
JNIEXPORT void JNICALL Java_sun_tools_attach_LinuxVirtualMachine_sendQuitTo
  (JNIEnv *env, jclass cls, jint pid)
{
    if (kill((pid_t)pid, SIGQUIT)) {
        JNU_ThrowIOExceptionWithLastError(env, "kill");
    }
}
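  4. Then enter a loop that retries every 200 ms, checking whether the .java_pid<pid> file exists in the tmp directory. The total waiting time is bounded by System.getProperty("sun.tools.attach.attachTimeout").
  5. Once the .java_pid<pid> file exists, create a socket and connect it to that file. Both steps are native methods, socket and connect.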

jdk/src/solaris/native/sun/tools/attach/LinuxVirtualMachine.c

/*
 * Class:     sun_tools_attach_LinuxVirtualMachine
 * Method:    socket
 * Signature: ()I
 */
JNIEXPORT jint JNICALL Java_sun_tools_attach_LinuxVirtualMachine_socket
  (JNIEnv *env, jclass cls)
{
    int fd = socket(PF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) {
        JNU_ThrowIOExceptionWithLastError(env, "socket");
    }
    return (jint)fd;
}

/*
 * Class:     sun_tools_attach_LinuxVirtualMachine
 * Method:    connect
 * Signature: (ILjava/lang/String;)I
 */
JNIEXPORT void JNICALL Java_sun_tools_attach_LinuxVirtualMachine_connect
  (JNIEnv *env, jclass cls, jint fd, jstring path)
{
    jboolean isCopy;
    const char* p = GetStringPlatformChars(env, path, &isCopy);
    if (p != NULL) {
        struct sockaddr_un addr;
        int err = 0;

        addr.sun_family = AF_UNIX;
        strcpy(addr.sun_path, p);

        if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == -1) {
            err = errno;
        }

        if (isCopy) {
            JNU_ReleaseStringPlatformChars(env, path, p);
        }

        /*
         * If the connect failed then we throw the appropriate exception
         * here (can't throw it before releasing the string as can't call
         * JNI with pending exception)
         */
        if (err != 0) {
            if (err == ENOENT) {
                JNU_ThrowByName(env, "java/io/FileNotFoundException", NULL);
            } else {
                char* msg = strdup(strerror(err));
                JNU_ThrowIOException(env, msg);
                if (msg != NULL) {
                    free(msg);
                }
            }
        }
    }
}

The socket function creates a Unix domain socket (UDS) with int fd = socket(PF_UNIX, SOCK_STREAM, 0);

The connect function establishes the connection with connect(fd, (struct sockaddr*)&addr, sizeof(addr))
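The attach-side handshake described in the numbered steps can be modelled in plain Java. The sketch below is a simplification under stated assumptions: AttachHandshakeSketch and its method names are hypothetical, the directory is passed in rather than hard-coded, and the SIGQUIT step is only marked by a comment because pure Java cannot send it. It is meant to show the order of operations, not to reproduce the JDK code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Simplified, pure-Java model of the attach-side handshake (hypothetical class).
final class AttachHandshakeSketch {

    // Socket file the target JVM is expected to create.
    static Path socketFile(Path tmpDir, int pid) {
        return tmpDir.resolve(".java_pid" + pid);
    }

    // Trigger file the attaching side creates.
    static Path attachFile(Path tmpDir, int pid) {
        return tmpDir.resolve(".attach_pid" + pid);
    }

    // Polls every delayMs until the socket file appears or timeoutMs elapses.
    // Returns the socket file path, or null if it never appeared.
    static Path waitForSocketFile(Path tmpDir, int pid, long timeoutMs, long delayMs)
            throws IOException, InterruptedException {
        Path socket = socketFile(tmpDir, pid);
        if (Files.exists(socket)) {
            return socket; // listener already running, no trigger needed
        }
        Path trigger = attachFile(tmpDir, pid);
        if (!Files.exists(trigger)) {
            Files.createFile(trigger);
        }
        try {
            // The real implementation sends SIGQUIT to the target at this point.
            long retries = timeoutMs / delayMs;
            for (long i = 0; i <= retries; i++) {
                Thread.sleep(delayMs);
                if (Files.exists(socket)) {
                    return socket;
                }
            }
            return null;
        } finally {
            Files.deleteIfExists(trigger); // the trigger file is removed either way
        }
    }
}
```

Calling waitForSocketFile(dir, pid, 5000, 200) would mirror the 200 ms retry interval from the text; the 5000 ms timeout is just an example value.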

jvm.loadAgent(agentFile.getAbsolutePath())

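VirtualMachine declares loadAgent as an abstract method; the concrete work is done in the implementing subclass. Specifically, it ends up calling this.execute("load", var1, var2 ? "true" : "false", var3), where var1 is the string "instrument", var2 is false, and var3 is the path of the jar.

execute is also abstract and its implementation is platform-specific. The Bsd implementation connects to the socket file found earlier, writes the string "1" first, and then writes the arguments above to the socket.

"1" is the ATTACH_PROTOCOL_VER defined by the JVM. A comment in the read_request method of hotspot/src/os/linux/vm/attachListener_linux.cpp explains the request format: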

// The request is a sequence of strings so we first figure out the
// expected count and the maximum possible length of the request.
// The request is:
//   <ver>0<cmd>0<arg>0<arg>0<arg>0
// where <ver> is the protocol version (1), <cmd> is the command
// name ("load", "datadump", ...), and <arg> is an argument
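To make the wire format concrete, here is a small sketch that builds such a request in memory: the protocol version, the command name, and exactly three arguments, each followed by a zero byte. AttachRequestEncoder is a hypothetical helper written for this article; padding missing arguments with empty strings and using UTF-8 are assumptions of the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that encodes an attach request into bytes.
final class AttachRequestEncoder {
    static final String PROTOCOL_VERSION = "1";

    // Layout: version, command, then three arguments, each terminated by a 0 byte.
    // Missing or null arguments are sent as empty strings.
    static byte[] encode(String cmd, String... args) {
        if (args.length > 3) {
            throw new IllegalArgumentException("at most 3 arguments");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, PROTOCOL_VERSION);
        writeString(out, cmd);
        for (int i = 0; i < 3; i++) {
            writeString(out, i < args.length && args[i] != null ? args[i] : "");
        }
        return out.toByteArray();
    }

    // Writes the string bytes followed by the zero terminator.
    private static void writeString(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        out.write(0);
    }
}
```

For example, encode("load", "instrument", "false", "/tmp/agent.jar") yields the bytes of 1, load, instrument, false and /tmp/agent.jar, each followed by a zero byte; the jar path here is a made-up value.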

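Once the command has been sent, the return value is read to decide whether the load succeeded. The var13 that is returned is the wrapped socket: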

        if (var7 != 0) {
            String var8 = this.readErrorMessage(var13);
            var13.close();
            if (var7 == 101) {
                throw new IOException("Protocol mismatch with target VM");
            } else if (var1.equals("load")) {
                throw new AgentLoadException("Failed to load agent library");
            } else if (var8 == null) {
                throw new AttachOperationFailedException("Command failed in target VM");
            } else {
                throw new AttachOperationFailedException(var8);
            }
        } else {
            return var13;
        }
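The status value var7 tested above has to be read from the socket first. The sketch below shows one way this could work, assuming the reply starts with the status as decimal digits terminated by a newline. That reply layout is an assumption of this example and is not shown in the decompiled code; AttachReplySketch and readStatus are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reader for the status line of an attach reply.
final class AttachReplySketch {
    static final int BAD_VERSION = 101; // reported as "Protocol mismatch with target VM"

    // Reads characters up to the first '\n' and parses them as the status code.
    // Assumption: the reply begins with "<digits>\n".
    static int readStatus(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            sb.append((char) b);
        }
        if (sb.length() == 0) {
            throw new IOException("Premature EOF");
        }
        try {
            return Integer.parseInt(sb.toString());
        } catch (NumberFormatException e) {
            throw new IOException("Non-numeric status: " + sb);
        }
    }
}
```

A status of 0 would then correspond to the else branch that returns var13, and anything else to one of the exceptions.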

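The target JVM side

SIGQUIT signal handling

In hotspot/src/share/vm/runtime/os.cpp, the function signal_thread_entry handles the signals delivered to the process. On SIGQUIT it first runs the attach check (!DisableAttachMechanism && AttachListener::is_init_trigger()); if the check does not pass, it prints the thread stack traces instead.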

// SIGBREAK is sent by the keyboard to query the VM state
#ifndef SIGBREAK
#define SIGBREAK SIGQUIT
#endif

// sigexitnum_pd is a platform-specific special signal used for terminating the Signal thread.


static void signal_thread_entry(JavaThread* thread, TRAPS) {
  os::set_priority(thread, NearMaxPriority);
  while (true) {
    int sig;
    {
      // FIXME : Currently we have not decieded what should be the status
      //         for this java thread blocked here. Once we decide about
      //         that we should fix this.
      sig = os::signal_wait();
    }
    if (sig == os::sigexitnum_pd()) {
       // Terminate the signal thread
       return;
    }

    switch (sig) {
      case SIGBREAK: {
        // Check if the signal is a trigger to start the Attach Listener - in that
        // case don't print stack traces.
        if (!DisableAttachMechanism && AttachListener::is_init_trigger()) {
          continue;
        }
        // Print stack traces
        // Any SIGBREAK operations added here should make sure to flush
        // the output stream (e.g. tty->flush()) after output.  See 4803766.
        // Each module also prints an extra carriage return after its output.
        VM_PrintThreads op;
        VMThread::execute(&op);
        VM_PrintJNI jni_op;
        VMThread::execute(&jni_op);
        VM_FindDeadlocks op1(tty);
        VMThread::execute(&op1);
        Universe::print_heap_at_SIGBREAK();
        ...
        break;
      }
			...
    }
  }
}

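is_init_trigger checks whether the file .attach_pid<pid> exists in the tmp directory and whether the file's owner is the effective user of the current JVM process. If both hold, it calls AttachListener's init method.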

// If the file .attach_pid<pid> exists in the working directory
// or /tmp then this is the trigger to start the attach mechanism
bool AttachListener::is_init_trigger() {
  if (init_at_startup() || is_initialized()) {
    return false;               // initialized at startup or already initialized
  }
  char path[PATH_MAX + 1];
  int ret;
  struct stat st;

  snprintf(path, PATH_MAX + 1, "%s/.attach_pid%d",
           os::get_temp_directory(), os::current_process_id());
  RESTARTABLE(::stat(path, &st), ret);
  if (ret == 0) {
    // simple check to avoid starting the attach mechanism when
    // a bogus user creates the file
    if (st.st_uid == geteuid()) {
      init();
      return true;
    }
  }
  return false;
}
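The same decision can be expressed in Java for readers less familiar with the C++ side. This is only a model of the logic: InitTriggerSketch is a hypothetical class, the effective user is passed in as a parameter because Java has no direct geteuid equivalent, and the init() call is replaced by the boolean result.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserPrincipal;

// Hypothetical Java model of the trigger check.
final class InitTriggerSketch {

    // True only if the listener is not yet initialized, the trigger file
    // exists, and the file is owned by the expected user.
    static boolean isInitTrigger(Path tmpDir, long pid, UserPrincipal effectiveUser,
                                 boolean alreadyInitialized) throws IOException {
        if (alreadyInitialized) {
            return false; // initialized at startup or already initialized
        }
        Path trigger = tmpDir.resolve(".attach_pid" + pid);
        if (!Files.exists(trigger)) {
            return false;
        }
        // Simple ownership check, so a file created by another user is ignored.
        return Files.getOwner(trigger).equals(effectiveUser);
    }
}
```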

hotspot/src/share/vm/services/attachListener.cpp

// Starts the Attach Listener thread
void AttachListener::init() {
  ...
  { MutexLocker mu(Threads_lock);
    JavaThread* listener_thread = new JavaThread(&attach_listener_thread_entry);
    ...
    Thread::start(listener_thread);
  }
}

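attach_listener_thread_entry is also a function in attachListener.cpp. According to its comments, it initializes the AttachListener, takes operations from a queue, and dispatches each one to the handler that matches its type.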

AttachListener

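Each platform has its own AttachListener implementation. hotspot/src/os/linux/vm/attachListener_linux.cpp is used as the example here; the other platforms probably follow the same approach, although details may differ.

Initialization

The AttachListener init method was mentioned above. Looking at how LinuxAttachListener initializes, it does essentially one thing: it creates the .java_pid<pid> file in the tmp directory, sets up a Unix domain socket server on it in listening mode, and sets the socket file's permissions to read/write.

At this point the attach request sent from the attaching side has been answered by the JVM side, and the socket channel between the two is in place.

After that, commands are taken off with AttachListener::dequeue(); and handled by the matching function.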

dequeue

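dequeue is an infinite loop. It repeatedly calls accept to take incoming connections on the socket, verifies that the peer's uid and gid match its own euid and egid, and then calls read_request, which reads from the socket and wraps the content in an instance of the AttachOperation class.

read_request defines the format of what is sent, <ver>0<cmd>0<arg>0<arg>0<arg>0. The data is wrapped as a VM operation, and each kind of operation is handled by its own function.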
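The receiving side of that format can be sketched as a small parser: split the bytes on zero terminators, expect a version, a command name and three arguments, and reject anything else. AttachRequestParser and Operation are hypothetical names, and returning null for a bad request is a simplification chosen for this example; the actual HotSpot code reports errors differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java model of parsing an attach request.
final class AttachRequestParser {
    static final int ARG_COUNT = 3;

    // Minimal stand-in for AttachOperation: a name plus its arguments.
    static final class Operation {
        final String name;
        final List<String> args;

        Operation(String name, List<String> args) {
            this.name = name;
            this.args = args;
        }
    }

    // Returns the parsed operation, or null when the request is malformed
    // or carries an unexpected protocol version.
    static Operation parse(byte[] request) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < request.length; i++) {
            if (request[i] == 0) {
                parts.add(new String(request, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // Expect version + command + 3 arguments, with no bytes after the last terminator.
        if (parts.size() != 2 + ARG_COUNT || start != request.length) {
            return null;
        }
        if (!"1".equals(parts.get(0))) {
            return null; // protocol version mismatch
        }
        return new Operation(parts.get(1), new ArrayList<>(parts.subList(2, parts.size())));
    }
}
```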

Supported operations

static AttachOperationFunctionInfo funcs[] = {
 { "agentProperties", get_agent_properties },
 { "datadump", data_dump },
#ifndef SERVICES_KERNEL
 { "dumpheap", dump_heap },
#endif // SERVICES_KERNEL
 { "load", JvmtiExport::load_agent_library },
 { "properties", get_system_properties },
 { "threaddump", thread_dump },
 { "inspectheap", heap_inspection },
 { "setflag", set_flag },
 { "printflag", print_flag },
 { NULL, NULL }
};
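The table is a plain name-to-function lookup. A Java equivalent might look like the sketch below, where the handlers are hypothetical stand-ins that only return a description string, and the wording of the fallback message for unknown names is likewise an assumption of this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical Java model of the command dispatch table.
final class AttachDispatchSketch {
    private final Map<String, Function<String[], String>> funcs = new LinkedHashMap<>();

    AttachDispatchSketch() {
        // Stand-in handlers; the real table points at native functions.
        funcs.put("properties", args -> "system properties");
        funcs.put("threaddump", args -> "thread dump");
        funcs.put("load", args -> "loaded " + args[0]);
    }

    // Looks the command up by name and runs its handler, if any.
    String dispatch(String name, String... args) {
        Function<String[], String> handler = funcs.get(name);
        if (handler == null) {
            return "Operation " + name + " not recognized!";
        }
        return handler.apply(args);
    }
}
```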

load_agent_library

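To inject an agent, send "1" "load" "instrument" "false" "<agent jar path>" to the socket. The last three arguments are passed to the load_agent_library function.

false means the name is not an absolute path, so the function locates the corresponding dynamic library and loads it. On macOS it finds jre/lib/libinstrument.dylib, that is, the lib path + "lib" + the argument + ".dylib".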
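Java exposes the same platform naming rule through System.mapLibraryName, which makes the lookup easy to illustrate. In the sketch below, AgentLibraryPathSketch and resolve are hypothetical names and the library directory is a parameter; the actual search logic in HotSpot is more involved than a single concatenation.

```java
import java.io.File;

// Hypothetical helper showing how a short agent name maps to a library file.
final class AgentLibraryPathSketch {

    // When isAbsolute is true the name is used as given; otherwise it is
    // expanded to the platform file name (libinstrument.dylib on macOS,
    // libinstrument.so on Linux) inside the given library directory.
    static String resolve(String libDir, String agent, boolean isAbsolute) {
        if (isAbsolute) {
            return agent;
        }
        return libDir + File.separator + System.mapLibraryName(agent);
    }
}
```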

If the library loads successfully, the function looks up the Agent_OnAttach function in it and calls it. If the call succeeds, the agent is added to the agent list.

Agent_OnAttach is a function defined by JVMTI. For more on JVMTI, see https://docs.oracle.com/en/java/javase/20/docs/specs/jvmti.html

It is implemented in jdk/src/share/instrument/InvocationAdapter.c

JNIEXPORT jint JNICALL
Agent_OnAttach(JavaVM* vm, char *args, void * reserved)

JPLIS stands for Java Programming Language Instrumentation Services

The rest of the analysis involves the JVMTI implementation and the JPLIS mentioned above; see https://blog.csdn.net/sun_tantan/article/details/105786883