OK thanks for explaining that. I think I understand your situation a little better.
Have you considered sorting and batching more than once per frame, issuing more than one group of draw calls? This is how my own code is set up. In games we typically submit things to draw into one big buffer. Then we sort this buffer and split it into batches, which can result in multiple draw calls.
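To make the "sort the buffer, then split it into draw calls" idea concrete, here is a minimal sketch. It assumes the only state that forces a new draw call is the texture; `DrawItem`, `Batch`, `sort_items` and `cut_batches` are made-up names for illustration, not from any real API or from my actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical item the game pushes into the big buffer.
struct DrawItem {
    std::uint32_t texture_id;  // state that forces a new draw call when it changes
    int sprite;                // stand-in for the actual geometry
};

// A batch is a run of consecutive items that can share one draw call.
struct Batch {
    std::uint32_t texture_id;
    std::size_t first;  // index of the first item in the sorted buffer
    std::size_t count;  // number of items in the run
};

static bool by_texture(const DrawItem& a, const DrawItem& b) {
    return a.texture_id < b.texture_id;
}

// Step 1: sort so items sharing a texture become neighbors.
// stable_sort keeps submission order among items with the same texture.
void sort_items(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(), by_texture);
}

// Step 2: walk the sorted buffer and start a new batch whenever the texture changes.
std::vector<Batch> cut_batches(const std::vector<DrawItem>& items) {
    assert(std::is_sorted(items.begin(), items.end(), by_texture));
    std::vector<Batch> batches;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (batches.empty() || batches.back().texture_id != items[i].texture_id)
            batches.push_back({items[i].texture_id, i, 0});
        ++batches.back().count;
    }
    return batches;
}
```

Each `Batch` would then turn into one real draw call. A real sort key usually packs more state than just the texture (shader, blend mode, depth), but the shape stays the same.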
Then one more layer of abstraction can be placed on top of this system, where the game can push geometry into different buffers. Each "buffer" is sorted and then cut up into the necessary batches. Each batch is rendered with a draw call. With this scheme you can put transparent geometry into whichever buffer you like, and it will get sorted into its own batches, which are submitted after the previous batches. Each "buffer" could be for a major feature of the game, like the in-game UI, the pause menu, the game geometry itself, the background, and potentially other stuff as well.
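Here is a rough sketch of the multi-buffer version. It is an assumption-heavy illustration, not my real code: the three layers, the `Renderer` type, and the sort rule (opaque items first grouped by texture, then transparent items back to front) are all picked just for the example, and "issuing" a draw call only means appending to a returned list.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layers, flushed in this order every frame.
enum Layer { BACKGROUND, WORLD, UI, LAYER_COUNT };

struct Item {
    std::uint32_t texture_id;
    float depth;       // assumption: larger means farther from the camera
    bool transparent;
};

struct DrawCall {
    Layer layer;
    std::uint32_t texture_id;
    bool transparent;
    std::size_t count;  // how many items were merged into this call
};

struct Renderer {
    std::array<std::vector<Item>, LAYER_COUNT> buffers;

    void push(Layer layer, Item item) {
        assert(layer < LAYER_COUNT);
        buffers[layer].push_back(item);
    }

    // Sort each buffer on its own, cut it into batches, and emit the batches
    // layer by layer. Batches never merge across layers.
    std::vector<DrawCall> flush() {
        std::vector<DrawCall> calls;
        for (int l = 0; l < LAYER_COUNT; ++l) {
            std::vector<Item>& buf = buffers[l];
            std::stable_sort(buf.begin(), buf.end(), [](const Item& a, const Item& b) {
                if (a.transparent != b.transparent) return !a.transparent;  // opaque first
                if (a.transparent) return a.depth > b.depth;                // back to front
                return a.texture_id < b.texture_id;                         // group by texture
            });
            const std::size_t layer_start = calls.size();
            for (const Item& it : buf) {
                const bool new_batch = calls.size() == layer_start
                    || calls.back().texture_id != it.texture_id
                    || calls.back().transparent != it.transparent;
                if (new_batch) calls.push_back({Layer(l), it.texture_id, it.transparent, 0});
                ++calls.back().count;
            }
            buf.clear();
        }
        return calls;
    }
};
```

Since transparent items are ordered by depth rather than texture, they tend to break into more, smaller batches than the opaque ones. That is the usual price of correct blending order.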
This is pretty much how my own personal code works. From what I gather, your setup is similar but only has one "buffer".
The way my graphics API works is by exposing a draw call structure that contains a shader, texture references, uniforms, and the geometry to draw. These draw calls map one to one with something like glDrawArrays or glDrawElements. The draw calls can be filled out in any number of ways. The rest of the game abstracts the draw call API with a few different systems. Each system performs sorting and batching as necessary. There is no single "global system to submit geometry to". There are a few different systems, and each one controls exactly how its draw calls are formed. Basically I'm saying it's not a good idea to try and force your entire engine to generate renderable geometry through a single generic API. Instead you should expose low level APIs that get consumed by higher level APIs, each of which exposes a specific feature. This is called API layering. The point is to avoid generic APIs and instead expose custom and very specific ones for each feature needed.
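As a sketch of that layering: a low level draw call struct, plus one higher level system (a sprite batcher) that decides entirely on its own how draw calls get formed. Everything here is hypothetical and simplified. `Device::submit` just records the call where a real implementation would bind state and issue one GPU draw, and shaders and textures are plain integer handles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// ---- Low level: one DrawCall maps to exactly one GPU draw. ----
struct Vertex { float x, y, u, v; };

struct DrawCall {
    std::uint32_t shader;
    std::vector<std::uint32_t> textures;
    std::vector<std::pair<std::string, float>> uniforms;
    std::vector<Vertex> vertices;
};

struct Device {
    std::vector<DrawCall> submitted;  // stand-in for real GPU submission
    void submit(DrawCall call) {
        assert(!call.vertices.empty());
        submitted.push_back(std::move(call));
    }
};

// ---- Higher level: a sprite system built on top of DrawCall. ----
struct Sprite { std::uint32_t texture; float x, y, w, h; };

class SpriteBatch {
public:
    explicit SpriteBatch(std::uint32_t shader) : shader_(shader) {}

    void push(const Sprite& s) { sprites_.push_back(s); }

    // Sort by texture, then build one DrawCall per run of equal textures.
    void flush(Device& device) {
        std::stable_sort(sprites_.begin(), sprites_.end(),
            [](const Sprite& a, const Sprite& b) { return a.texture < b.texture; });
        std::size_t i = 0;
        while (i < sprites_.size()) {
            DrawCall call{shader_, {sprites_[i].texture}, {}, {}};
            for (; i < sprites_.size() && sprites_[i].texture == call.textures[0]; ++i)
                append_quad(call.vertices, sprites_[i]);
            device.submit(std::move(call));
        }
        sprites_.clear();
    }

private:
    // Two triangles (six vertices) per sprite.
    static void append_quad(std::vector<Vertex>& out, const Sprite& s) {
        const Vertex a{s.x, s.y, 0, 0}, b{s.x + s.w, s.y, 1, 0};
        const Vertex c{s.x + s.w, s.y + s.h, 1, 1}, d{s.x, s.y + s.h, 0, 1};
        out.insert(out.end(), {a, b, c, a, c, d});
    }

    std::uint32_t shader_;
    std::vector<Sprite> sprites_;
};
```

A post-processing system would sit next to `SpriteBatch` and fill out `DrawCall` in a completely different way (one full-screen triangle, different uniforms), without either system knowing about the other.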
I have one of these draw call generating systems for sprites and sprite batching, a bunch of smaller ones for very specific shader effects, and one for full-screen post-processing effects. Each system generates draw calls very differently, and each one performs some kind of sorting + batching internally. Each one knows how to handle transparency in its own special way.
Does this help? If not, I'm happy to discuss more to try and help, but I would need a little more info.