Cobalt Strike Process Injection Techniques and Detection Ideas - 526互联

Cobalt Strike 3.14 finally delivered some of the process injection flexibility I’ve long wanted to see in the product. In this post, I’d like to write about my thoughts on process injection, and share a few details on how Cobalt Strike’s implementation(s) work. Along the way, I will share details about which methods you might want to use in your red team exercises.

Where does Cobalt Strike process inject?

Cobalt Strike does process injection in a few places. Some of its artifacts spawn and migrate to a new process. While these are an important part of the attack chain, they’re under your control via the Artifact Kit, Applet Kit, and Resource Kit. This post focuses on the process injection in Cobalt Strike’s Beacon payload.

The inject and shinject commands inject code into an arbitrary remote process. Some of the tool’s built-in post-exploitation jobs can target specific remote processes too. Cobalt Strike does this because it’s safer to inject a capability into a context that has the data you want vs. migrating a payload and C2 to that context.

Many of Cobalt Strike’s post-exploitation features spawn a temporary process, inject the feature’s DLL into the process, and retrieve the results over a named pipe. This is a special case of process injection. In these cases, we control the temporary process. We know the process has no purpose beyond our offense action. This allows us to do more aggressive things. For example, we can take over the main thread of these temporary processes and not worry about giving it back. This is an important detail to keep in mind when configuring process injection in Cobalt Strike.

The Process Injection Cycle

The process-inject block in a Malleable C2 profile is where you configure process injection in Cobalt Strike:

process-inject {
    # set remote memory allocation technique
    set allocator "NtMapViewOfSection";

    # shape the content and properties of what we will inject
    set min_alloc "16384";
    set userwx    "false";

    transform-x86 {
        prepend "\x90";
    }

    transform-x64 {
        prepend "\x90";
    }

    # specify how we execute code in the remote process
    execute {
        CreateThread "ntdll!RtlUserThreadStart";
        CreateThread;
        NtQueueApcThread-s;
        CreateRemoteThread;
        RtlCreateUserThread;
    }
}

This block is organized around the lifecycle of the process injection process. Here are the steps:

1. Open a handle to the remote process
2. Allocate memory in the remote process
3. Copy the injected data to the remote process
4. Ask the remote process to execute our injected code

Allocate and Copy Data to a Remote Process

Step 1 is kind of implicit. If we spawn a temporary process (e.g., for a post-exploitation job), we already have a handle to do things to the remote process. If we want to inject code into an existing remote process (naughty, naughty), Cobalt Strike will use OpenProcess to get that handle.

Steps 2 and 3:

Cobalt Strike offers two options to allocate memory in a remote process and copy data to it.

The first option is the classic VirtualAllocEx -> WriteProcessMemory pattern. This is a common pattern in offense tools. This option also works across different process architectures. This matters. Process injection is not limited to an x64 process context injecting into an x64 target process. A good implementation needs to account for the different corner cases that come up (e.g., x86 -> x64, x64 -> x86, etc.). This requirement makes VirtualAllocEx a safe choice. It’s also the default Cobalt Strike uses. If you want to explicitly specify this pattern: set the process-inject -> allocator option to VirtualAllocEx.

Cobalt Strike also has the CreateFileMapping -> MapViewOfFile -> NtMapViewOfSection pattern. This option creates a file mapping that is backed by the Windows system paging file. It then maps a view of that mapped file into the current process. Cobalt Strike then copies the injected data to the memory associated with that view. The NtMapViewOfSection call makes the same mapped file (with our local changes) available in the remote target process. This is available if you set process-inject -> allocator to NtMapViewOfSection. The downside to this option is it only works x86 -> x86 and x64 -> x64. For cross-architecture injection, Cobalt Strike will fall back to the VirtualAllocEx pattern. This pattern is useful in situations where a defense solution homes in on VirtualAllocEx -> WriteProcessMemory but does not detect other methods to copy data into a remote process.
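The fallback rule described above can be sketched as a small decision function. This is only an illustrative model of the behavior the text describes, not Beacon's code; the function name choose_allocator and its string arguments are assumptions made for the example.

```python
def choose_allocator(configured: str, source_arch: str, target_arch: str) -> str:
    """Return the allocation pattern to use for one injection (illustrative model)."""
    if configured not in ("VirtualAllocEx", "NtMapViewOfSection"):
        raise ValueError(f"unknown allocator: {configured}")
    # The mapped-section pattern only works x86 -> x86 and x64 -> x64,
    # so cross-architecture injection falls back to VirtualAllocEx.
    if configured == "NtMapViewOfSection" and source_arch != target_arch:
        return "VirtualAllocEx"
    return configured


print(choose_allocator("NtMapViewOfSection", "x64", "x64"))
print(choose_allocator("NtMapViewOfSection", "x86", "x64"))
```

The point of the sketch is that the allocator setting is a preference, not a guarantee: what actually gets used also depends on the architecture pair.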

Transform your Data

The above description of steps 2 and 3 assumes that you’re copying the injected data over as-is. That’s not necessarily true. Cobalt Strike’s process-inject block has options to transform the injected data. The min_alloc option is the minimum size of the block Beacon will allocate in a remote process. The startrwx and userwx options are a hint to the initial and final permissions of the allocated memory. If you want to avoid RWX pages, set these options to false. The transform-x86 and transform-x64 blocks allow you to pad either side of the injected data. If you prepend data, make sure it’s valid code to execute for that architecture.
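To make these options concrete, here is a minimal sketch of what the shaping step amounts to. The helper names (shape_payload, page_permissions) are hypothetical, and the RW/RX alternatives to RWX are my assumption about what "false" selects, since the text above only calls these options hints.

```python
from typing import Tuple


def shape_payload(data: bytes, prepend: bytes = b"", append: bytes = b"",
                  min_alloc: int = 0) -> Tuple[bytes, int]:
    """Pad the injected data on either side and compute the allocation size."""
    shaped = prepend + data + append
    # min_alloc is a floor on the size of the remote allocation.
    alloc_size = max(min_alloc, len(shaped))
    return shaped, alloc_size


def page_permissions(startrwx: bool, userwx: bool) -> Tuple[str, str]:
    """Initial and final page permissions (assumed: RW and RX when not RWX)."""
    initial = "RWX" if startrwx else "RW"
    final = "RWX" if userwx else "RX"
    return initial, final


# A stand-in payload; the single NOP mirrors the prepend "\x90" in the profile.
shaped, size = shape_payload(b"\xcc" * 10, prepend=b"\x90", min_alloc=16384)
print(len(shaped), size, page_permissions(startrwx=False, userwx=False))
```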

The options to transform content in the process-inject block are very basic. They’re basic because these are options that are safe for all injected content. If I assume what I receive is a position-independent blob that is a self-contained program, I know I am OK to prepend and append data to it at will. If I assume that this position-independent blob does not modify itself, I know I can get away without RWX permissions. These things are as far as I’m willing to go with data I know nothing about. For more aggressive changes to injected content itself, use the Malleable C2 stage block to modify Beacon. Use the Malleable C2 post-ex block to modify Cobalt Strike’s actual post-exploitation DLLs.

Don’t dismiss these transforms because they are basic though. A lot of content signatures look for specific bytes at fixed offsets from the beginning of an observable boundary. These checks occur in O(1) time which is favorable to an O(n) [or worse] search. Too many expensive checks and a security technology can run into performance issues.
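A toy example of the difference between the two kinds of check. The signature bytes and function names are made up for illustration and do not correspond to any real product's rule.

```python
SIGNATURE = b"\xde\xad\xbe\xef"  # hypothetical content signature


def fixed_offset_match(buf: bytes, sig: bytes, offset: int = 0) -> bool:
    """O(1)-style check: compare bytes at one fixed offset from the start."""
    return buf[offset:offset + len(sig)] == sig


def search_match(buf: bytes, sig: bytes) -> bool:
    """O(n) check: scan the whole buffer for the signature."""
    return sig in buf


original = SIGNATURE + b"\x00" * 32
padded = b"\x90" + original  # one prepended byte shifts every offset

print(fixed_offset_match(original, SIGNATURE), fixed_offset_match(padded, SIGNATURE))
print(search_match(padded, SIGNATURE))
```

A single prepended byte is enough to miss the cheap check, while the full scan still finds the bytes. That is the trade-off a defender pays for in scan time.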

Binary padding can also affect the thread start address offset of your Cobalt Strike post-exploitation jobs. When Beacon injects a DLL into memory, it starts the thread at the location of that DLL’s exported ReflectiveLoader function. This offset shows up in the thread’s start address characteristic and is a potential indicator to hunt for a specific post-exploitation DLL. Data prepended to an injected DLL affects this offset. (Less visible threads help too; we’ll get to that in a moment…)

Part 3 of In-Memory Evasion has some more discussion on content, memory, and thread characteristics that are used to detect injected DLLs in memory.

Code Execution: So many damned corner cases…

At this point, we assume our injected content is in the remote process. The next step is to execute that content. This is where the process-inject -> execute block comes in. Here, you get to specify which options Cobalt Strike will consider when it needs to inject code. Beacon goes through these options, one at a time, and tries the options that are valid for the current context. When one of these options succeeds, Beacon stops this process.
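The "try each valid option in order, stop at the first success" behavior can be modeled in a few lines. Everything here is a stand-in: the context dictionary, the validity checks, and the simulated failures are assumptions for illustration, not how Beacon represents them internally.

```python
from typing import Callable, List, Optional, Tuple

# (name, is_valid_for_context, attempt): all three are illustrative stand-ins.
Option = Tuple[str, Callable[[dict], bool], Callable[[dict], bool]]


def run_execute_block(options: List[Option], context: dict) -> Optional[str]:
    """Walk the options in order; skip invalid ones; stop at the first success."""
    for name, is_valid, attempt in options:
        if not is_valid(context):
            continue  # not right for this injection context, ignore it
        if attempt(context):
            return name  # success, stop here
    return None  # every applicable option failed


def works(name: str) -> Callable[[dict], bool]:
    """Simulated attempt: fails only if the context lists the option as blocked."""
    return lambda ctx: name not in ctx["blocked"]


# Hypothetical context: an existing remote process where CreateRemoteThread fails.
context = {"self_inject": False, "suspended": False, "blocked": {"CreateRemoteThread"}}

options: List[Option] = [
    ("CreateThread", lambda ctx: ctx["self_inject"], works("CreateThread")),
    ("NtQueueApcThread-s", lambda ctx: ctx["suspended"], works("NtQueueApcThread-s")),
    ("CreateRemoteThread", lambda ctx: not ctx["self_inject"], works("CreateRemoteThread")),
    ("RtlCreateUserThread", lambda ctx: not ctx["self_inject"], works("RtlCreateUserThread")),
]

chosen = run_execute_block(options, context)
print(chosen)  # RtlCreateUserThread
```

If the list had ended at CreateRemoteThread, this context would have returned None, which is the kind of seemingly random failure a missing fallback produces.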

I mentioned it earlier, but I want to emphasize it again: process injection is filled with corner cases. The list of options you specify has to cover these corner cases. If your list of options misses a corner case, you will find that process injection fails for seemingly random reasons. My goal with this blog post is to help clear up some of these seemingly random reasons.

What are those corner cases?

All of the injection techniques implemented in Cobalt Strike work x86 -> x86 and x64 -> x64. Injecting from one architecture into the same architecture is the trivial base case. But, x86 -> x64 and x64 -> x86 are contexts that matter too.

One context factor (favorable, if we treat it differently) is whether or not the remote process is a throw-away temporary process. Remember, Beacon’s post-ex jobs spawn a temporary process and, because the process is temporary, we can do more aggressive things.

Another favorable context factor is self-injection. If we inject into our own process, we can and should treat that differently. We can simply use VirtualAlloc and CreateThread when injecting into ourselves. When dealing with a security stack that aggressively swats remote process injection, self-injection is a way to safely use capabilities that can target a remote process.

One last corner case is whether or not the injected data has an argument. I can pass an argument via SetThreadContext with an x64 target (thanks fastcall!). Cobalt Strike’s implementation can’t pass an argument, via SetThreadContext, with an x86 target. Bummer.

We’re not done though. When dealing with remote process injection there are other factors. Some methods are riskier on Windows XP era systems. *gasp*. RtlCreateUserThread falls into this camp. And, other methods don’t work when you have to inject across desktop session boundaries (CreateRemoteThread, I’m looking at you).

Code Execution: The perfect execute block

Some of the execute options are scoped to the special cases described above. When you specify your execute block, put these special cases (self-injection, suspended processes) first. Beacon will ignore these options when they’re not right for the current injection context.

Next, you should follow up with the methods you want Beacon to use in general. Remember, each method has different context limitations and failure cases. If you care that your process injection succeeds, OPSEC be damned, make sure you have backups to the primary methods you specify. This is how Beacon’s process injection cocktail worked before 3.14 gave control to your profiles.

Let’s walk through the different execute options implemented in Beacon and their nuances:

Code Execution: CreateThread

I’ll start with CreateThread. I think CreateThread should come first in an execute block (if it’s there at all). This option will only run when you’re doing self-injection. It spins up a thread that points to the code you want Beacon to run. Be cautious though. When you self-inject this way, your thread will have a start address that’s not associated with one of the modules (DLLs, the current program itself) loaded into the current process space. This is a tell used to detect injected content. To help with this, you can specify CreateThread “module!somefunction+0x##”. This variant will spawn a suspended thread that points to the specified function. If the specified function is not available via GetProcAddress, this variant will immediately fail. Beacon will use SetThreadContext to update this new thread to run your injected code. This is a way of doing self-injection that gives your thread a more favorable start address.
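As a reading aid for the “module!somefunction+0x##” form, here is a small parser sketch. The function name parse_start_address is hypothetical, and treating the offset as optional hexadecimal is my reading of the syntax shown above, not a statement about how Beacon parses it.

```python
from typing import Tuple


def parse_start_address(spec: str) -> Tuple[str, str, int]:
    """Split 'module!function+0x##' into (module, function, offset)."""
    module, bang, rest = spec.partition("!")
    if not bang or not module or not rest:
        raise ValueError(f"expected module!function, got {spec!r}")
    function, plus, offset_text = rest.partition("+")
    if not function:
        raise ValueError(f"missing function name in {spec!r}")
    # The offset is optional; int(..., 16) accepts a leading 0x.
    offset = int(offset_text, 16) if plus else 0
    return module, function, offset


print(parse_start_address("ntdll!RtlUserThreadStart"))
print(parse_start_address("kernel32!LoadLibraryA+0x10"))
```

The second spec is only an example of the shape; whichever function you name has to be resolvable via GetProcAddress in the target, as the paragraph above notes.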

Code Execution: SetThreadContext

The next place to go is SetThreadContext. This is one of the methods available to take over the primary thread of a temporary process spawned for a post-exploitation job. Beacon’s SetThreadContext option works x86 -> x86, x64 -> x64, and x64 -> x86. If you choose to use SetThreadContext, put it after the CreateThread option(s) in your execute block. When you use SetThreadContext, your thread will have a start address that reflects the original execution entry point of the temporary process.

Code Execution: NtQueueApcThread-s

Another option for suspended processes is NtQueueApcThread-s. This option uses NtQueueApcThread to queue a one-off function that runs when the target thread wakes up next. In this case, the target thread is the primary thread of our temporary process. This method’s next step is to call ResumeThread. This function wakes up the primary thread of our suspended process. Because the process is suspended, we don’t have to worry about giving this primary thread back to the process. Supposedly, executing code this way allows our injected capability to initialize itself in the process before some userland-resident security products initialize themselves. This method of evasion was labeled the early bird injection technique by researchers from Cyberbit. This option is x86 -> x86 and x64 -> x64 only.

The choice between SetThreadContext and NtQueueApcThread-s is up to you. I don’t think one is clearly better than the other in all contexts.

Code Execution: NtQueueApcThread

The next option to consider is NtQueueApcThread. This is a different implementation from NtQueueApcThread-s. It’s designed to target an existing remote process. This implementation pushes an RWX stub to the remote process. This stub contains both code and context related to the injection. To execute this stub, we add our stub to the APC queue of every thread in the remote process. If one of those threads enters an alertable state, our stub will execute.

What does the stub do?

The stub first checks if it was already run. If it was, it does nothing. This is to prevent our injected code from running multiple times.

The stub then calls CreateThread with our injected code and its argument. We do this to allow the APC to quickly return and let the original thread go on about its business.

There’s a risk that no thread will wake up and execute our stub. Beacon waits about 200ms and checks the stub to determine if the code ran. If it didn’t, we update the stub to mark the injection as having run, and we move on to the next injection technique. That’s the implementation of this technique.
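The run-once logic is easier to see in a toy model. This is a plain-Python stand-in with no real APCs or threads; the class and method names are invented for the example, and the counter simply stands in for the stub calling CreateThread on the injected code.

```python
class ApcStub:
    """Toy model of the stub's run-once guard (not real injection code)."""

    def __init__(self) -> None:
        self.already_ran = False
        self.threads_started = 0

    def on_apc(self) -> bool:
        """Called when any thread holding the queued APC becomes alertable."""
        if self.already_ran:
            return False  # a second APC firing does nothing
        self.already_ran = True
        self.threads_started += 1  # stands in for CreateThread on the injected code
        return True


def injector_check(stub: ApcStub) -> bool:
    """Run once after the ~200ms wait: True if the code ran, else disable the stub."""
    if stub.already_ran:
        return True
    # Nothing woke up: mark the stub as run so a late APC cannot fire
    # after we have moved on to the next technique.
    stub.already_ran = True
    return False


lucky = ApcStub()
lucky.on_apc()
lucky.on_apc()  # APC queued on a second thread, ignored
print(lucky.threads_started, injector_check(lucky))

unlucky = ApcStub()
print(injector_check(unlucky), unlucky.on_apc(), unlucky.threads_started)
```

Marking the stub as run on timeout is what keeps the fallback safe: without it, a thread that becomes alertable later could start a second copy of the injected code.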

I’ve had several requests for this option, because some security products have less visibility into this event. That said, this implementation has its OPSEC concerns. It does push that RWX stub which itself is a noisy memory indicator. It also calls CreateThread against our code that was pushed into this remote process. The start address of this thread is not backed by a module on disk. It won’t do well with a Get-InjectedThread sweep. If you find this injection method valuable, go ahead and use it. Just be aware that it has its trade-offs. One other note: this method (as I’ve implemented it) is x86 -> x86 and x64 -> x64 only.

Code Execution: CreateRemoteThread

Another option is CreateRemoteThread. This is the standard-issue remote process injection technique. As of Windows Vista, it does fail when injecting code across session boundaries. In Cobalt Strike, vanilla CreateRemoteThread covers x86 -> x86, x64 -> x64, and x64 -> x86 cases. This technique is also very visible. The Sysmon event 8 will fire when this method is used to create a thread in another process. Beacon does implement a CreateRemoteThread variant that accepts a fake start address in the form “module!function+0x##”. Like CreateThread, Beacon will create this thread in a suspended state and use SetThreadContext/ResumeThread to make it run our code. This variant is x86 -> x86 and x64 -> x64 only. This variant will fail if the specified function is not available via GetProcAddress.

Code Execution: RtlCreateUserThread

The last option available to Cobalt Strike’s execute block is RtlCreateUserThread. Be aware! This option is similar to CreateRemoteThread without some of its limitations. It does have its own drawbacks though.

RtlCreateUserThread will inject code across session boundaries. Supposedly it has some trouble in some injection contexts on Windows XP. This may or may not matter to you. This method DOES fire Sysmon event 8 as well. One benefit to RtlCreateUserThread is it covers x86 -> x86, x64 -> x64, x64 -> x86, AND x86 -> x64. This last corner case is important to address.

x86 -> x64 injection happens when you’re in an x86 Beacon context and you spawn an x64 process for a post-exploitation job. The hashdump, mimikatz, execute-assembly, and powerpick modules all default to an x64 context where they can. To pull off the feat of x86 -> x64 injection, this implementation transitions your x86 process to an x64 mode and injects an RWX stub to call RtlCreateUserThread from an x64 context. This implementation comes from Meterpreter and the RWX stub is a loud memory indicator. I’ve long advised: “stay x64 as much as possible”. This type of detail is the reason why. I do recommend RtlCreateUserThread exist in any process-inject -> execute block though. It makes sense to have this as the bottom-most option. Use it when nothing else works.
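The constraints scattered across the sections above can be collected into one lookup table. The sketch below only restates what this post says about each option; the scope labels ("self", "suspended", "existing", "any_remote") are a simplification I introduce for the example, and the fake-start-address variants are left out. Cross-session failure is recorded only for CreateRemoteThread because that is the only one the text calls out.

```python
from typing import List

SAME_ARCH = {("x86", "x86"), ("x64", "x64")}
PLUS_X64_TO_X86 = SAME_ARCH | {("x64", "x86")}
ALL_PAIRS = PLUS_X64_TO_X86 | {("x86", "x64")}

# scope: "self" = own process, "suspended" = temporary suspended process,
# "existing" = already-running remote process, "any_remote" = either remote case.
TECHNIQUES = {
    "CreateThread": {"scope": "self", "arch": SAME_ARCH, "fails_cross_session": False},
    "SetThreadContext": {"scope": "suspended", "arch": PLUS_X64_TO_X86, "fails_cross_session": False},
    "NtQueueApcThread-s": {"scope": "suspended", "arch": SAME_ARCH, "fails_cross_session": False},
    "NtQueueApcThread": {"scope": "existing", "arch": SAME_ARCH, "fails_cross_session": False},
    "CreateRemoteThread": {"scope": "any_remote", "arch": PLUS_X64_TO_X86, "fails_cross_session": True},
    "RtlCreateUserThread": {"scope": "any_remote", "arch": ALL_PAIRS, "fails_cross_session": False},
}

ALLOWED = {
    "self": {"self"},
    "suspended": {"suspended", "any_remote"},
    "existing": {"existing", "any_remote"},
}


def viable(source: str, target: str, context: str, cross_session: bool = False) -> List[str]:
    """List the options that are not ruled out for this injection context."""
    result = []
    for name, info in TECHNIQUES.items():
        if info["scope"] not in ALLOWED[context]:
            continue
        if (source, target) not in info["arch"]:
            continue
        if cross_session and info["fails_cross_session"]:
            continue
        result.append(name)
    return result


print(viable("x86", "x64", "suspended"))
print(viable("x64", "x86", "suspended"))
```

For an x86 Beacon spawning an x64 post-exploitation process, only RtlCreateUserThread survives the filter, which is why it earns its place at the bottom of the execute block.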

Life without (Remote) Process Injection

When I think about how to make an offense technique flexible, I also like to give similar consideration to what would I do if this technique were not an option?

Process injection is a way to move a payload/capability to a different process context (e.g., go from desktop session 0 to desktop session 1). It’s possible to move to a different process context without remote process injection. Use the runu command. This Beacon command will execute a program as a child of an arbitrary process you specify. This is a way to get a capability into another desktop session (for example) without remote process injection.

Process injection is also a way to execute capabilities on-target without putting a capability on disk. In Cobalt Strike, many post-exploitation capabilities have the option to target a specific process. To use these without remote process injection, specify your current Beacon process. This is self-injection.

Sometimes, putting something on disk is the best option available. I once had success compiling a keystroke logger as a DLL and dropping it to c:\windows\linkinfo.dll to (eventually) load it into explorer.exe. We used an open share on the same system to periodically grab our keystrokes. This helped my colleagues and me operate in a highly-scrutinized situation where it was difficult to keep a memory-resident payload alive on target.

If you enjoy these types of thought exercises, I recommend watching Agentless Post Exploitation and Fighting the Toolset.

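A translated version is available online:

0x01 Preface

In the Cobalt Strike 3.14 update I noticed new process injection functionality that was exactly what I had been hoping for. So I decided to write up my thoughts on process injection, share some technical details of how Cobalt Strike implements it, and cover a few red team techniques you may want to know about.

0x02 Injection features

Cobalt Strike performs process injection in several places. Some of its artifacts spawn a new process and migrate into it; you control these through the Artifact Kit, Applet Kit, and Resource Kit. This article focuses on process injection in Cobalt Strike's Beacon payload.

The inject and shinject commands inject code into an arbitrary remote process, and some built-in post-exploitation modules can also target a specific remote process. Cobalt Strike does this because injecting a capability into the context that holds the data you want is safer than migrating the payload and C2 into that context.

(Presumably because, with a direct migration, it would be awkward if the new session failed to come up after the original session had already died.)

Many of Cobalt Strike's post-exploitation features therefore spawn a temporary process, inject the feature's DLL into it, and retrieve the results over a named pipe. This is a special case of process injection: we control the temporary process, so we can take over its main thread without worrying about crashing a program we care about and losing access. This is an important detail to understand when configuring process injection in Cobalt Strike.

The first argument to the inject command mentioned in the original is the PID of the target process. The second is the target's architecture, which defaults to x86 if omitted.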

inject 5732 x64

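shinject takes its arguments the same way as inject. If the third argument is omitted, you are prompted to choose a shellcode file. Note that the payload must be generated in raw bin format.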

shinject 5732 x64 /xxx.bin

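Besides the two Beacon commands mentioned in the original, there is also shspawn, which starts a process and injects shellcode into it. You only need to specify the architecture.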

shspawn x64 /xxx.bin

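Here the payload is injected into rundll32.exe. This approach is much more stable than the previous two because there is no risk of crashing an existing program.

0x03 Injection workflow

The process-inject block in a Cobalt Strike Malleable C2 profile is where process injection is configured: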

process-inject {
    # set remote memory allocation technique
    set allocator "NtMapViewOfSection";

    # shape the content and properties of what we will inject
    set min_alloc "16384";
    set userwx    "false";

    transform-x86 {
        prepend "\x90";
    }

    transform-x64 {
        prepend "\x90";
    }

    # specify how we execute code in the remote process
    execute {
        CreateThread "ntdll!RtlUserThreadStart";
        CreateThread;
        NtQueueApcThread-s;
        CreateRemoteThread;
        RtlCreateUserThread;
    }
}

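The block is organized around roughly the following steps:

Open a handle to the remote process.
Allocate memory in the remote process.
Copy the shellcode into the remote process.
Execute the shellcode in the remote process.

Step 1: allocate and copy data to the remote process

The first step exists but is mostly implicit. If we spawn a temporary process (for example, for a post-exploitation job), we already have a handle to the remote process. If we want to inject code into an existing remote process... [doge], Cobalt Strike uses OpenProcess to get one.

Steps 2-3

Cobalt Strike offers two options for allocating memory in a remote process and copying data into it.

The first option is the classic VirtualAllocEx -> WriteProcessMemory pattern, which is common in offensive tools. It also works across different process architectures, and that matters: process injection is not limited to an x64 process injecting into an x64 target. A good implementation has to account for the corner cases that come up (for example x86 -> x64 or x64 -> x86). This makes VirtualAllocEx a safe choice, and it is Cobalt Strike's default. To specify it explicitly, set the process-inject -> allocator option to VirtualAllocEx.

The second option is the CreateFileMapping -> MapViewOfFile -> NtMapViewOfSection pattern. It first creates a file mapping backed by the Windows system paging file, then maps a view of that mapping into the current process. Cobalt Strike copies the injected data into the memory associated with that view, and the NtMapViewOfSection call makes the same mapping available in the remote target process. To use it, set process-inject -> allocator to NtMapViewOfSection. The drawback is that it only works x86 -> x86 and x64 -> x64; for cross-architecture injection Cobalt Strike automatically falls back to the VirtualAllocEx pattern. It is worth trying when a defensive product blocks VirtualAllocEx -> WriteProcessMemory injection but does not detect other ways of copying data into a remote process.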

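Data transformation

The description of steps 2 and 3 above assumes the injected data is copied over as-is. That is not necessarily the case. Cobalt Strike's process-inject block has options to transform the injected data. The min_alloc option is the minimum size of the block Beacon allocates in the remote process. The startrwx and userwx options are hints for the initial and final permissions of the allocated memory. To avoid read-write-execute (RWX) pages, set them to false. The transform-x86 and transform-x64 blocks let you pad either side of the injected data. If you prepend data, make sure it is valid executable code for that architecture.

The transforms available in the process-inject block are very basic, because these are the options that are safe for all injected content. If I assume that what I receive is a position-independent blob that is a self-contained program, I know I can prepend and append data at will. If I assume the blob does not modify itself, I know I can get by without RWX permissions. That is as far as I am willing to go with data I know nothing about. For more aggressive changes to the injected content itself, use the Malleable C2 stage block to modify Beacon and the Malleable C2 post-ex block to modify Cobalt Strike's post-exploitation DLLs.

Do not dismiss these transforms for being basic. Many content signatures look for specific bytes at a fixed offset from the start of an observable boundary. Those checks run in O(1) time, which is preferable to an O(n) search. With too many expensive checks, a security product can run into performance problems.

Binary padding also affects the thread start address offset of Cobalt Strike's post-exploitation jobs. When Beacon injects a DLL into memory, it starts the thread at the location of that DLL's exported ReflectiveLoader function. This offset shows up in the thread's start address and is a potential indicator for hunting a specific post-exploitation DLL. Data prepended to the injected DLL changes this offset. (Less visible threads help too; more on that shortly...)

Part 3 of In-Memory Evasion discusses the content, memory, and thread characteristics used to detect injected DLLs in memory.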

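0x04 Code execution: so many damned corner cases...

(Translator: it is all text with not a single figure, which makes it tough to chew through...)

In this section we assume the data is already in the remote process, so the next step is to execute it. This is where the process-inject -> execute block comes in. You specify which options Cobalt Strike considers when it needs to inject code. Beacon goes through these options one at a time, tries the ones that are valid for the current context, and stops as soon as one succeeds.

I mentioned it earlier but want to stress it again: process injection is full of corner cases, and the list of options you specify has to cover them. Miss one and injection will fail for what looks like random reasons. My goal with this post is to help clear up some of those seemingly random reasons.

What are the corner cases?

All of the injection techniques in Cobalt Strike work x86 -> x86 and x64 -> x64. Injecting into the same architecture is the trivial base case, but x86 -> x64 and x64 -> x86 matter too.

One context factor (a favorable one, if we treat it differently) is whether the remote process is a throw-away temporary process. Beacon's post-ex jobs spawn a temporary process, and because it is temporary we can do more aggressive things.

Another favorable factor is self-injection. If we inject into our own process, we can and should treat that differently: we can simply use VirtualAlloc and CreateThread. When facing a security stack that aggressively blocks remote process injection, self-injection is a safe way to use capabilities that would otherwise target a remote process.

The last corner case is whether the injected data takes an argument. With an x64 target the argument can be passed via SetThreadContext (thanks, fastcall!). Cobalt Strike's implementation cannot pass an argument via SetThreadContext with an x86 target.

There are more factors in remote process injection than these. Some methods are riskier on Windows XP era systems, and RtlCreateUserThread is one of them. Other methods do not work when you have to inject across desktop session boundaries (CreateRemoteThread, I'm looking at you).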

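0x05 Code execution: the perfect execute block

Some execute options are scoped to the special cases described above. When you write your execute block, put these special cases (self-injection, suspended processes) first. Beacon ignores them when they do not fit the current injection context.

Next, list the methods you want Beacon to use in general. Keep in mind that every method has its own context limitations and failure cases. If you care that injection succeeds, OPSEC be damned, make sure you have backups for the primary methods you specify. This is how Beacon's process injection cocktail worked before 3.14 handed control to your profile.

Let's walk through the execute options implemented in Beacon and their nuances: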

CreateThread

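I'll start with CreateThread. If it is present at all, I think it should come first in an execute block. This option only runs when you are doing self-injection. CreateThread spins up a thread that points at the code you want Beacon to run. Be careful, though: when you self-inject this way, your thread has a start address that is not associated with any module (a DLL or the program itself) loaded into the current process, and that is a tell used to detect injected content. To help with this, you can specify CreateThread "module!somefunction+0x##". This variant spawns a suspended thread that points at the specified function. If the function cannot be resolved via GetProcAddress, the variant fails immediately. Beacon then uses SetThreadContext to update the new thread to run the injected code. This is a way of self-injecting that gives your thread a more favorable start address.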

SetThreadContext

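Next is SetThreadContext. This is one of the methods for taking over the main thread of a temporary process spawned for a post-exploitation job. Beacon's SetThreadContext works x86 -> x86, x64 -> x64, and x64 -> x86. If you use it, put it after the CreateThread option(s) in your execute block. With SetThreadContext, your thread has a start address that reflects the original entry point of the temporary process.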

NtQueueApcThread-s

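Another option for suspended processes is NtQueueApcThread-s. It uses NtQueueApcThread to queue a one-off function that runs the next time the target thread wakes up. Here the target thread is the main thread of the temporary process. The next step is to call ResumeThread, which wakes up the main thread of our suspended process. Because the process is suspended, we do not have to worry about giving the main thread back to it. This option is x86 -> x86 and x64 -> x64 only.

Whether to use SetThreadContext or NtQueueApcThread-s is up to you. I do not think either is clearly better in every context.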

NtQueueApcThread

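The next option to consider is NtQueueApcThread. Unlike NtQueueApcThread-s, it is designed to target an existing remote process. It pushes an RWX stub into the remote process; the stub contains both code and context related to the injection. To execute it, the stub is added to the APC queue of every thread in the remote process. As soon as one of those threads enters an alertable state, the stub runs.

So what does the stub do?

First, the stub checks whether it has already run. If so, it does nothing, which prevents the injected code from running more than once.

Next, the stub calls CreateThread with our injected code and its argument. This lets the APC return quickly so the original thread can carry on with its work.

There is a risk that no thread wakes up and executes the stub. Beacon waits about 200ms and then checks the stub to see whether the code ran. If it did not, Beacon updates the stub to mark the injection as already run and moves on to the next injection technique. That is how this technique is implemented.

I have had several requests for this option, because some security products have less visibility into this event. That said, it has its OPSEC concerns. It pushes an RWX stub, which is itself a noisy memory indicator. It also calls CreateThread against the code pushed into the remote process, so the thread's start address is not backed by a module on disk and will not fare well against a Get-InjectedThread sweep. If you find this injection method valuable, go ahead and use it, but weigh the trade-offs. Note also that this method is x86 -> x86 and x64 -> x64 only.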

CreateRemoteThread

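Another option is CreateRemoteThread, the standard-issue remote process injection technique. As of Windows Vista, it fails when injecting code across session boundaries. In Cobalt Strike, vanilla CreateRemoteThread covers x86 -> x86, x64 -> x64, and x64 -> x86. The technique is also very visible: Sysmon event 8 fires when it is used to create a thread in another process. Beacon also implements a CreateRemoteThread variant that accepts a fake start address in the form "module!function+0x##". As with CreateThread, Beacon creates the thread suspended and uses SetThreadContext/ResumeThread to make it run our code. This variant is x86 -> x86 and x64 -> x64 only, and it fails if the specified function cannot be resolved via GetProcAddress.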

RtlCreateUserThread

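The last option available in Cobalt Strike's execute block is RtlCreateUserThread. It is similar to CreateRemoteThread without some of its limitations, but it has drawbacks of its own.

RtlCreateUserThread will inject code across session boundaries. Reportedly it has trouble in some injection contexts on Windows XP. This method also fires Sysmon event 8. One benefit of RtlCreateUserThread is that it covers x86 -> x86, x64 -> x64, x64 -> x86, and x86 -> x64. That last case is important.

x86 -> x64 injection happens when you are in an x86 Beacon and spawn an x64 process for a post-exploitation job. The hashdump, mimikatz, execute-assembly, and powerpick modules all default to x64 where they can. To pull off x86 -> x64 injection, this implementation transitions the x86 process into x64 mode and injects an RWX stub that calls RtlCreateUserThread from an x64 context. The technique comes from Meterpreter, and the RWX stub is a loud memory indicator. I have long advised "stay x64 as much as possible", and this is why. I still recommend having RtlCreateUserThread in every process-inject -> execute block. It makes sense as the bottom-most option: use it when nothing else works.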

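0x06 Life without process injection

When I think about how to make an offensive technique flexible, I also ask what I would do if the technique were not an option at all.

Process injection is a way to move a payload or capability into a different process context (for example, from desktop session 0 to desktop session 1). The runu command lets you move to a different context without process injection: it runs a program as a child of an arbitrary process you specify. This is a way to get a capability into another desktop session without remote process injection.

Process injection is also a way to execute capabilities on target without putting anything on disk. Many of Cobalt Strike's post-exploitation features can target a specific process. To use them without remote injection, specify your current Beacon process. This is self-injection.

Sometimes putting something on disk is the best option available. I once had success compiling a keystroke logger as a DLL and dropping it to c:\windows\linkinfo.dll so that it was (eventually) loaded into explorer.exe. We used an open share on the same system to collect the captured keystrokes periodically. This helped my colleagues and me operate in a highly scrutinized environment where it was hard to keep a memory-resident payload alive.

If you enjoy this kind of thought exercise, I recommend watching Agentless Post Exploitation and Fighting the Toolset.

(This took a whole day to chew through. The original uses a fair amount of slang and some passages left me confused; corrections to any poor translation are welcome.)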