Using Reflective Loaders to Replace LoadLibrary for Hot-Swappable Modules in C++

Disclaimer

The material in this article is provided for educational purposes only. Mapping PE images from memory, implementing reflective loaders, and building plugin-style interfaces can be powerful techniques that must only be used on systems you own or have explicit written permission to test.

By using any ideas or code patterns described here, you accept full responsibility for complying with all applicable laws, contracts, and policies. The author and publisher do not endorse or condone illegal activity and assume no liability for misuse, loss, or damages.

To practice safely:

  • Work in an isolated lab (offline VMs, snapshots, disposable accounts).

  • Keep tests scoped, logged, and authorized.

  • Prefer well-documented, maintainable implementations over “clever” evasion.

  • Treat examples as starting points, not turnkey tools; review for correctness, security, and ethics before use.

Summary

I’m going to show a practical way to map a PE DLL straight from memory and call its exported interface functions like hot-swappable plugins. The focus here is on being clear and repeatable: a stable plugin ABI, a predictable mapper, and a test flow you can run on a Friday night without cursing at your screen. To keep this article readable, longer listings stay as code blocks you can drop into your project when you’re ready for testing.


Table of Contents

  • Background

  • The Idea

  • The Journey

    • What is Reflective DLL Loading?

    • Where to start?

    • Reimplement the Technique

    • A Custom GetProcAddress

    • Mapper

    • Plugin ABI

    • Interface & exports

    • Example Module: Command Execution

    • Host Test Harness

    • Checklist for the harness:

  • Extensions

  • Going Forward

  • Conclusions

  • References


Background

I fell down this rabbit hole after a few chats about modular “hot plugin” designs (shoutout to @Jonathan Wallace’s TactixC2 threads), plus a lot of tinkering with Windows PE internals and community loaders. The endgame: ship a tiny, in-process mapper; fetch the right DLL when you need it; and call into a clean interface—rather than jumping blindly to a single entry point and hoping for the best.

This style keeps the resident footprint small and avoids spinning up helper processes for every action. In practice: the agent maps a DLL from bytes, resolves a handful of exports, constructs a plugin object, and runs execute(task) with a simple payload.


The Idea

Here’s the flow I aim for:

  1. The teamserver sends an instruction plus the matching plugin DLL bytes.

  2. The agent maps that DLL from memory (custom mapper).

  3. The agent locates exports (factory, init, exec, cleanup).

  4. The agent constructs an IPlugin and calls init → execute(task) → cleanup.

  5. Output goes wherever you’ve decided (console, buffers, whatever fits your world).

One important note: you’re replacing the loader’s mapping steps for this in-memory image. You can still lean on OS APIs to satisfy that image’s imports. Think “a custom mapper for this module,” not “banish LoadLibrary forever.”


The Journey

What is Reflective DLL Loading?

It’s basically the OS loader, but by hand: parse headers, copy sections, apply relocations, resolve imports, set protections, and optionally call the DLL’s entry point (DllMain), all from a byte buffer you already have. Same puzzle, different pieces.


Where to start?

A good place to start is Stephen Fewer's implementation, which is more than a decade old. It first processes exports to find the functions the loader itself needs. The mapping then proceeds in these steps:

  1. Load each section at its VirtualAddress and copy its raw bytes.

  2. Walk the import table and write resolved addresses into the IAT.

  3. Apply relocations appropriate to the target architecture.

  4. Set final protections for headers and sections to mirror the PE’s intent.

  5. Optionally flush the instruction cache and, if this module depends on loader-driven init, invoke the entry point.
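To make the relocation step concrete, here is a minimal sketch of walking base relocation blocks over an image buffer that is already laid out at its RVAs. It is not Fewer's code: the function name, parameters, and the choice to reject unknown relocation types are my own assumptions, and it uses plain offsets instead of the Windows SDK structs so it compiles anywhere. It assumes a little-endian host and handles only DIR64 and HIGHLOW entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Relocation types from the PE format (top 4 bits of each 16-bit entry).
constexpr std::uint16_t kRelAbsolute = 0;   // padding entry, skipped
constexpr std::uint16_t kRelHighLow  = 3;   // 32-bit fixup
constexpr std::uint16_t kRelDir64    = 10;  // 64-bit fixup

// Adds delta to the T-sized value stored at offset `at`, with bounds checking.
template <typename T>
static bool patch(std::uint8_t* image, std::size_t size, std::size_t at,
                  std::int64_t delta) {
    if (at > size || size - at < sizeof(T)) return false;
    T v;
    std::memcpy(&v, image + at, sizeof(T));
    v = static_cast<T>(v + static_cast<T>(delta));
    std::memcpy(image + at, &v, sizeof(T));
    return true;
}

// Hypothetical helper: applies the base relocation directory found at
// relocRva/relocSize. delta = actual base - preferred ImageBase.
// Returns false on malformed blocks or unsupported relocation types.
bool apply_relocations(std::uint8_t* image, std::size_t imageSize,
                       std::uint32_t relocRva, std::uint32_t relocSize,
                       std::int64_t delta) {
    if (relocRva > imageSize || relocSize > imageSize - relocRva) return false;
    std::size_t pos = relocRva;
    const std::size_t end = pos + relocSize;
    while (end - pos >= 8) {
        std::uint32_t pageRva = 0, blockSize = 0;
        std::memcpy(&pageRva, image + pos, 4);        // block header: page RVA
        std::memcpy(&blockSize, image + pos + 4, 4);  // block header: total size
        if (blockSize == 0) break;                    // treat as terminator
        if (blockSize < 8 || blockSize > end - pos) return false;
        const std::size_t count = (blockSize - 8) / 2;
        for (std::size_t i = 0; i < count; ++i) {
            std::uint16_t entry = 0;
            std::memcpy(&entry, image + pos + 8 + i * 2, 2);
            const std::uint16_t type = static_cast<std::uint16_t>(entry >> 12);
            const std::size_t target = std::size_t(pageRva) + (entry & 0x0FFF);
            if (type == kRelAbsolute) continue;
            if (type == kRelDir64) {
                if (!patch<std::uint64_t>(image, imageSize, target, delta)) return false;
            } else if (type == kRelHighLow) {
                if (!patch<std::uint32_t>(image, imageSize, target, delta)) return false;
            } else {
                return false;  // unsupported type: fail loudly rather than half-map
            }
        }
        pos += blockSize;
    }
    return true;
}
```

Every block and every fixup target is bounds-checked before it is touched, which is where most off-by-one relocation bugs tend to hide.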


Reimplement the Technique

My first sanity check was a small “map and call” PoC I found on ired.team, just enough to see each step work in isolation. The big insight: once the image lives in memory, you can walk its exports yourself and call them directly.


A Custom GetProcAddress

Since the OS loader didn’t register your mapped image like a normal module, you’ll want a tiny export walker over your base: read IMAGE_EXPORT_DIRECTORY, find the name, convert RVA → VA, return the pointer. Nothing fancy—just reliable.
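A minimal sketch of such an export walker follows. To keep it portable and testable, it reads the IMAGE_EXPORT_DIRECTORY fields by their documented offsets rather than including the Windows headers, and it returns the export's RVA (a real resolver would add the mapped base to it). The names find_export_rva and read_at are hypothetical. It does a linear scan and does not handle forwarded exports or lookup by ordinal.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Bounds-checked read of a little-endian value at an RVA inside the image.
template <typename T>
static bool read_at(const std::uint8_t* image, std::size_t size,
                    std::size_t rva, T& out) {
    if (rva > size || size - rva < sizeof(T)) return false;
    std::memcpy(&out, image + rva, sizeof(T));
    return true;
}

// Hypothetical helper: finds `name` in the export directory located at
// exportDirRva and returns the function's RVA, or 0 if absent or malformed.
// IMAGE_EXPORT_DIRECTORY offsets used: NumberOfFunctions (20),
// NumberOfNames (24), AddressOfFunctions (28), AddressOfNames (32),
// AddressOfNameOrdinals (36).
std::uint32_t find_export_rva(const std::uint8_t* image, std::size_t size,
                              std::uint32_t exportDirRva, const char* name) {
    const std::size_t dir = exportDirRva;
    std::uint32_t numFuncs = 0, numNames = 0, funcs = 0, names = 0, ords = 0;
    if (!read_at(image, size, dir + 20, numFuncs) ||
        !read_at(image, size, dir + 24, numNames) ||
        !read_at(image, size, dir + 28, funcs) ||
        !read_at(image, size, dir + 32, names) ||
        !read_at(image, size, dir + 36, ords)) {
        return 0;
    }
    const std::size_t wanted = std::strlen(name) + 1;  // compare incl. NUL
    for (std::uint32_t i = 0; i < numNames; ++i) {
        std::uint32_t nameRva = 0;
        if (!read_at(image, size, std::size_t(names) + std::size_t(i) * 4, nameRva))
            return 0;
        if (nameRva > size || size - nameRva < wanted) continue;
        if (std::memcmp(image + nameRva, name, wanted) != 0) continue;

        // The name index maps to an ordinal, which indexes the function table.
        std::uint16_t ordinal = 0;
        if (!read_at(image, size, std::size_t(ords) + std::size_t(i) * 2, ordinal) ||
            ordinal >= numFuncs) {
            return 0;
        }
        std::uint32_t funcRva = 0;
        if (!read_at(image, size, std::size_t(funcs) + std::size_t(ordinal) * 4, funcRva))
            return 0;
        return funcRva;  // caller converts RVA to VA: base + funcRva
    }
    return 0;
}
```

Real export name tables are sorted, so a binary search is possible; for a handful of plugin exports the linear scan is simpler to audit.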


Mapper

A neat mapper returns the mapped base (your stand-in for HMODULE). It should:

  • Validate DOS/NT headers

  • Allocate SizeOfImage

  • Copy headers + sections

  • Apply relocations

  • Resolve imports

  • Set final protections

  • Optionally flush the instruction cache

From here, you can call your custom export resolver against the returned base.
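The first bullet, header validation, is worth sketching on its own because everything after it trusts those fields. The following is a minimal, portable sketch that covers only that step. It assumes a 64-bit (PE32+) image and reads fields by their documented offsets instead of using the Windows SDK structs. PeLayout, read_le, and validate_pe64 are hypothetical names, and allocation, copying, and import resolution are deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// What the later mapping steps need to know (hypothetical struct).
struct PeLayout {
    std::uint32_t sizeOfImage;   // bytes to reserve for the mapped image
    std::uint16_t sectionCount;  // number of section headers
    std::size_t   sectionTable;  // file offset of the first section header
};

// Bounds-checked little-endian read at a file offset.
template <typename T>
static bool read_le(const std::uint8_t* p, std::size_t n, std::size_t off, T& out) {
    if (off > n || n - off < sizeof(T)) return false;
    std::memcpy(&out, p + off, sizeof(T));
    return true;
}

// Validates DOS/NT headers and section bounds of a PE32+ file image.
bool validate_pe64(const std::uint8_t* file, std::size_t n, PeLayout& out) {
    std::uint16_t mz = 0, optMagic = 0, optSize = 0, count = 0;
    std::uint32_t lfanew = 0, sig = 0, imageSize = 0;

    if (!read_le(file, n, 0, mz) || mz != 0x5A4D) return false;          // "MZ"
    if (!read_le(file, n, 0x3C, lfanew)) return false;                   // e_lfanew
    if (!read_le(file, n, lfanew, sig) || sig != 0x00004550) return false;  // "PE\0\0"

    const std::size_t fileHdr = std::size_t(lfanew) + 4;  // IMAGE_FILE_HEADER
    const std::size_t optHdr  = fileHdr + 20;             // optional header
    if (!read_le(file, n, fileHdr + 2, count)) return false;     // NumberOfSections
    if (!read_le(file, n, fileHdr + 16, optSize)) return false;  // SizeOfOptionalHeader
    if (!read_le(file, n, optHdr, optMagic) || optMagic != 0x20B) return false;  // PE32+
    if (optSize < 60 || !read_le(file, n, optHdr + 56, imageSize)) return false; // SizeOfImage

    const std::size_t table = optHdr + optSize;
    for (std::uint16_t i = 0; i < count; ++i) {
        const std::size_t sh = table + std::size_t(i) * 40;  // 40-byte section header
        if (sh > n || n - sh < 40) return false;
        std::uint32_t va = 0, rawSize = 0, rawPtr = 0;
        read_le(file, n, sh + 12, va);       // VirtualAddress
        read_le(file, n, sh + 16, rawSize);  // SizeOfRawData
        read_le(file, n, sh + 20, rawPtr);   // PointerToRawData
        if (rawPtr > n || rawSize > n - rawPtr) return false;             // source in file
        if (va > imageSize || rawSize > imageSize - va) return false;     // dest in image
    }
    out.sizeOfImage = imageSize;
    out.sectionCount = count;
    out.sectionTable = table;
    return true;
}
```

Rejecting a malformed buffer here, before anything is allocated or copied, keeps the remaining steps free of defensive checks against the same fields.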


Plugin ABI

Task payload

I like a plain C struct of char* + length pairs so modules can stay light and avoid high-level runtime baggage if they want.
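As an illustration, a payload along those lines might look like the sketch below. The field names (command, args), the TaskField pair type, and the make_field helper are assumptions of mine, not a fixed layout; the point is only that the struct is plain data with explicit lengths, so a module never has to rely on NUL termination or on a shared runtime.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

extern "C" {

// One pointer + length pair. A fixed-width length keeps the layout explicit.
struct TaskField {
    const char*   data;  // may be null when the field is absent
    std::uint32_t len;   // number of bytes at data; 0 when absent
};

// Hypothetical task payload: which fields exist is up to your ABI.
struct TaskApi {
    TaskField command;
    TaskField args;
};

}  // extern "C"

// Guard the ABI: both types must stay plain, C-compatible data.
static_assert(std::is_standard_layout<TaskApi>::value, "TaskApi must be standard layout");
static_assert(std::is_trivial<TaskApi>::value, "TaskApi must be trivial");

// Host-side convenience: wraps a C string, mapping null to an absent field.
inline TaskField make_field(const char* s) {
    TaskField f;
    f.data = s;
    f.len = s ? static_cast<std::uint32_t>(std::strlen(s)) : 0;
    return f;
}
```

The static_asserts are cheap insurance: if someone later adds a std::string or a virtual function to the payload, the build breaks instead of the ABI.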

Interface & exports

Keep the ABI tiny and predictable:

  • Factory: create_plugin() → IPlugin*

  • Lifecycle: plugin_init(IPlugin*), plugin_cleanup(IPlugin*)

  • Dispatch: plugin_exec(TaskApi* task)

  • Inside the module: init(), execute(TaskApi*), cleanup()

Keep the same calling convention across the interface and your host typedefs.
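The shape above can be sketched roughly as follows. This is an illustration under several assumptions rather than a reference header: the return types, the PLUGIN_CALL and PLUGIN_EXPORT macros, the EchoPlugin class, and the module-level instance pointer (used so that plugin_exec can keep the TaskApi*-only signature listed above) are all my own choices, and plain new/delete stands in for whatever allocator the module uses.

```cpp
#include <cstdint>
#include <new>

// One calling convention and one export style for the whole interface.
#if defined(_WIN32)
#  define PLUGIN_CALL   __cdecl
#  define PLUGIN_EXPORT extern "C" __declspec(dllexport)
#else
#  define PLUGIN_CALL
#  define PLUGIN_EXPORT extern "C"
#endif

struct TaskField { const char* data; std::uint32_t len; };
struct TaskApi   { TaskField command; TaskField args; };

// Interface used inside the module. The host treats IPlugin* as opaque and
// never deletes it; destruction happens in plugin_cleanup.
struct IPlugin {
    virtual bool init() = 0;
    virtual int  execute(const TaskApi* task) = 0;
    virtual void cleanup() = 0;
    virtual ~IPlugin() = default;
};

namespace {

// Toy module: reports the command length instead of doing real work.
class EchoPlugin final : public IPlugin {
public:
    bool init() override { ready_ = true; return true; }
    int execute(const TaskApi* task) override {
        if (!ready_ || !task || !task->command.data) return -1;
        return static_cast<int>(task->command.len);
    }
    void cleanup() override { ready_ = false; }
private:
    bool ready_ = false;
};

IPlugin* g_instance = nullptr;  // assumption: one plugin instance per module

}  // namespace

PLUGIN_EXPORT IPlugin* PLUGIN_CALL create_plugin() {
    if (!g_instance) g_instance = new (std::nothrow) EchoPlugin();
    return g_instance;
}

PLUGIN_EXPORT bool PLUGIN_CALL plugin_init(IPlugin* p) {
    return p && p->init();
}

PLUGIN_EXPORT int PLUGIN_CALL plugin_exec(TaskApi* task) {
    return g_instance ? g_instance->execute(task) : -1;
}

PLUGIN_EXPORT void PLUGIN_CALL plugin_cleanup(IPlugin* p) {
    if (!p) return;
    p->cleanup();
    if (p == g_instance) g_instance = nullptr;
    delete p;  // allocated in this module, so freed in this module
}
```

Because creation and destruction both live behind exports, the host and the module never need to agree on an allocator, only on the four signatures.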

Next come some helper functions. Writing them was the moment I understood where the Beacon* API functions used by Cobalt Strike’s Beacon come from:

We need these because the loaded DLL runs its own code internally but still has to report its results on stdout. To achieve this, we create wrapper functions around the WriteFile API that write to standard output.

Next, we create CRT-free allocation and deallocation helpers, thin wrappers around the HeapAlloc/HeapFree APIs, to allocate the plugin instance and destroy it accordingly:

Finally, we define the export functions that the agent will use to drive the DLL:


Example Module: Command Execution

Here’s a simple module that runs a command via CreateProcess with an inherited pipe, and streams stdout/stderr back through your chosen channel:

Keep this logic with the module, not in the shared header—otherwise every plugin drags command-runner baggage along for the ride. Compile this DLL using the following command:


Embedding a DLL as Bytes (for Testing)

I keep a tiny script around that turns a compiled DLL into a C/C++ byte array for fast experiments:
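As a stand-in, here is a minimal C++ sketch of such a converter. It is not the script referred to above: the function names write_c_array and dump_file, the 12-bytes-per-line wrapping, and the generated `_len` variable are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iomanip>
#include <iterator>
#include <ostream>
#include <string>
#include <vector>

// Writes `bytes` as a C/C++ array definition named `name`, 12 bytes per line,
// followed by a matching length variable.
void write_c_array(const std::vector<std::uint8_t>& bytes,
                   const std::string& name, std::ostream& out) {
    const std::ios_base::fmtflags oldFlags = out.flags();
    const char oldFill = out.fill('0');

    out << "unsigned char " << name << "[] = {";
    out << std::hex;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i % 12 == 0) out << "\n    ";
        out << "0x" << std::setw(2) << static_cast<unsigned>(bytes[i]);
        if (i + 1 != bytes.size()) out << ", ";
    }
    out << std::dec << "\n};\nunsigned int " << name << "_len = "
        << bytes.size() << ";\n";

    out.fill(oldFill);    // restore the caller's stream formatting
    out.flags(oldFlags);
}

// Reads a whole file in binary mode and emits it as an array.
// Returns false if the file cannot be opened.
bool dump_file(const std::string& path, const std::string& name,
               std::ostream& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::vector<std::uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                                    std::istreambuf_iterator<char>());
    write_c_array(bytes, name, out);
    return true;
}
```

Calling dump_file with a DLL path, an array name, and std::cout from a small main prints a header-ready array that can be redirected to a file or the clipboard.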

Executing this script with the output piped to clip:

Yields the following on your clipboard, which you can save as TestDll.h:


Host Test Harness

A simple harness maps the module once, resolves exports, constructs the plugin, and dispatches a test task.

Watch out, though: the generated byte array runs to more than 13,000 lines.

We plug this into our test reflective loader and try it out. We will use the following to execute our command:

Checklist for the harness:

  • Map once and resolve all exports from that same mapped base.

  • Create/destroy via the exported functions (don’t mix in delete).

  • Pass a fully populated TaskApi* (use null+0 for absent fields).

  • Clean up handles and free buffers even on error paths.
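The checklist can be sketched as a small host-side driver. To keep it independent of any particular mapper, the sketch takes the export resolver as a function pointer. RawFn, Resolver, run_task, and the return types are hypothetical, and on 32-bit Windows the typedefs would also need to carry the calling convention used by the module.

```cpp
#include <cstdint>
#include <cstring>

struct IPlugin;  // opaque to the host: only ever passed back to the module

struct TaskField { const char* data; std::uint32_t len; };
struct TaskApi   { TaskField command; TaskField args; };

// Generic function pointer returned by the resolver, cast per export.
using RawFn    = void (*)();
using Resolver = RawFn (*)(const void* base, const char* name);

using CreateFn  = IPlugin* (*)();
using InitFn    = bool (*)(IPlugin*);
using ExecFn    = int (*)(TaskApi*);
using CleanupFn = void (*)(IPlugin*);

// Resolves all four exports from the same mapped base, then runs
// create -> init -> exec -> cleanup. Returns exec's result, or -1 on failure.
int run_task(Resolver resolve, const void* base, TaskApi* task) {
    const auto create  = reinterpret_cast<CreateFn>(resolve(base, "create_plugin"));
    const auto init    = reinterpret_cast<InitFn>(resolve(base, "plugin_init"));
    const auto exec    = reinterpret_cast<ExecFn>(resolve(base, "plugin_exec"));
    const auto cleanup = reinterpret_cast<CleanupFn>(resolve(base, "plugin_cleanup"));

    // Refuse to start unless the whole interface is present.
    if (!create || !init || !exec || !cleanup) return -1;

    IPlugin* plugin = create();
    if (!plugin) return -1;

    int rc = -1;
    if (init(plugin)) rc = exec(task);

    // Destroy via the module's own export, even when init failed.
    cleanup(plugin);
    return rc;
}
```

Resolving everything up front means a module with a missing export is rejected before any of its code runs, and cleanup sits on the single exit path after creation so error cases cannot skip it.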


Extensions

Once the ABI is in place, adding modules is boring—in a good way:

  • init() for per-module setup

  • execute(TaskApi*) for the actual work

  • cleanup() for teardown

The agent doesn’t need to know how you did it—only how to map, resolve, and call.

IPlugin.h

Here’s the full IPlugin.h you can standardize on:


Going Forward

If you want this to scale without pain:

  • Keep a single canonical header for the ABI and shared helpers.

  • Decide “CRT or no CRT” per module and document it.

  • Be explicit about what your mapper supports (relocs/imports/protections) and what it doesn’t (TLS, delay-load, forwarded exports, CFG, SEH tables).

  • Add targeted tests (missing imports, weird sections, map/unmap cycles). Future-you will be grateful.


Conclusions

If there’s a single lesson I’m taking away from this little adventure, it’s that “reflective loading” stops being a magic trick the moment you understand the moving parts. It’s just the PE lifecycle (headers, sections, relocations, imports, protections) played back on your own terms. That doesn’t make it trivial (I lost entire evenings to off-by-one bugs in relocation blocks and to forgotten final page protections), but it does make it tractable. And once you can map a DLL from bytes reliably, a clean plugin ABI feels like the obvious next step rather than a leap of faith.

The second lesson is about restraint. It’s tempting to bolt every clever idea onto the mapper: TLS callbacks, delay-loads, forwarded exports, CFG tables, SEH registration, thread-local storage, the works. Ask me how I know. In practice, the small boring mapper that does the minimum well is the one you’ll trust to ship. Push anything module-specific into the modules. Keep the interface tight and predictable. When you need to extend it, extend it deliberately—prefer an extra field in your task struct over a one-off “special case” that only one module understands.

There are tradeoffs worth calling out. Compared to BOFs, reflective DLLs can feel heavier and noisier; compared to regular DLLs, they can be brittle if your mapper only supports a subset of PE features. That’s okay; just be explicit about your contract. Document what your loader guarantees (relocs/imports/protections) and what it doesn’t (TLS, delay-load, forwarded exports). If a module needs something exotic, either teach the mapper that one new trick or reject the module early with a clear error. Silent half-support is the worst failure mode.

Finally, a note on scope and responsibility. Techniques like these are powerful, and like most powerful things, they can be used in ways that help or harm. For my part, this work came out of curiosity about loaders, a desire to build smaller, more modular tooling, and a genuine love for understanding how Windows actually works under the hood. If you take anything from this post, let it be the craft: build carefully, measure honestly, write down the edges, and leave it cleaner than you found it.


References (for further reading)
