Topic: Typesafe dynamic objects in C++ (Advanced)
eclectocrat
« on: March 02, 2011, 09:50:44 PM »

Type erasure, or awesome compile time magic to make the C++ type system more dynamic.

Here is a scenario that is common enough to many C++ developers. You have just developed an event system that will send notifications to a list of registered observers. After careful analysis you have decided to use strings/enums/chars/porkchops as event identifiers. Perhaps your code resembles the below.

Code:
#include <functional>
#include <string>

typedef std::string EventID;
namespace EnemyEvent
{
    const EventID didSpawn   = "Enemy.didSpawn";
    const EventID didDie     = "Enemy.didDie";
    const EventID didMove    = "Enemy.didMove";
    const EventID didAttack  = "Enemy.didAttack";
    const EventID didLevelUp = "Enemy.didLevelUp";
}
typedef std::function<void (EventID const&)> EventCallback;

However you ended up organizing them, event IDs are pretty simple to set up. Why? Because they are homogeneously typed: all event IDs are the same type and can be easily compared. Managing a single type in a statically typed language is trivial. After some careful consideration, you realize that an event ID alone is not sufficient data to act upon; you need some more! But what kind of data do you need, exactly? It obviously depends on the type of event. However, events are not types, they are string/int/porkchop constants. They don't carry any information other than their identity. So let's make them types.

Code:
class EventBase
{
public:
    virtual ~EventBase ();
    virtual int intValue () const;
    virtual std::string stringValue () const;
    virtual std::shared_ptr<Object> objectValue () const;
    virtual Point pointValue () const;
    virtual Porkchop porkchopValue () const;
    // virtual GodzillaEggHash . . . and so on, one accessor per payload type
};
typedef std::function<void (EventBase const *)> EventCallback;

Whew. That was crazy. For every conceivable type that might be given in every conceivable event I need to add an interface to EventBase. I don't know about you but I can't see so far into the future that I can predict every type of event that the game will generate. Let's try another approach. Why not just pass the Enemy object referred to by the event!?

Code:
typedef std::function<void (EventID const&, Enemy*)> EventCallback;

This is a nice improvement because it allows us to pass a lot of information with the event. There are a few shortfalls though. Take for instance the 'didMove' event. Where exactly did the enemy move from? I mean we can get its current location from the enemy object, but what about its previous location? We can add a 'previousLocation' function to the Enemy class, or we can send out a 'willMove' event before the 'didMove' event. Again we suffer from the combinatorial explosion of data and types. Not ideal.

So what then should we do? Here is a pretty damn good solution:

Code:
typedef std::function<void (EventID const&, Enemy*, void*)> EventCallback;

Gross! A void pointer is like herpes. Just don't touch it. Nevertheless, it does pretty well in basic circumstances. I mean, we have the event ID, the source and at least 32bits of additional plain old data. Still, the void pointer is a big fat warning flag that says: 'I'm going to come back and fux up your life when you present this game to an investor'. Not only that, but you can only pass PLAIN data in the void pointer, no destructors or copy constructors will be called on the faceless data.
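To see why the raw pointer is risky, here is a minimal sketch. Weapon, Stash and send_event are made-up names, not part of the event system above. The sender's shared_ptr dies when the function returns, yet the stashed void* still looks perfectly valid; a weak_ptr is kept next to it only so we can observe that the object is gone.

```cpp
#include <cassert>
#include <memory>

struct Weapon { int damage; };

// What a listener might keep around after the callback returns.
struct Stash
{
    void * raw;                     // the payload as the callback saw it
    std::weak_ptr<Weapon> watcher;  // only here to observe the lifetime
};

// Simulates firing an event whose payload is owned by a local shared_ptr.
Stash send_event ()
{
    std::shared_ptr<Weapon> weapon = std::make_shared<Weapon>();
    weapon->damage = 7;

    Stash stash;
    stash.raw = weapon.get();        // taking a void* adds no owner
    stash.watcher = weapon;          // a weak_ptr does not keep it alive either
    assert(weapon.use_count() == 1);
    return stash;                    // the last owner is destroyed here
}
```

After send_event returns, stash.raw is non-null but dangling, and nothing in the type system says so. Dereferencing it would be undefined behaviour, which is why the sketch never does.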

Now I'm a picky bastard and I DEMAND that my EventCallbacks can be given any type, safely and efficiently. I might want to send a shared_ptr or a PorkchopSandwich instance to my event listeners, and I have the right, gorramit!

So I hack up a data container like so:

Code:
struct Data
{
unsigned char * bytes;
unsigned int byteCount;
};

Now when generating an event I can do the following:

Code:
Data eventData;
eventData.byteCount = sizeof(std::shared_ptr<Weapon>);
eventData.bytes = new unsigned char[eventData.byteCount];
// Construct the shared_ptr in the raw bytes (placement new, from <new>).
// Assigning into bytes that were never constructed would be undefined behaviour.
new (eventData.bytes) std::shared_ptr<Weapon>(_enemy->weapon());

// . . .
std::for_each(_observers.begin(), _observers.end(),
    std::bind(&EventCallback::operator(), _1, eventID, eventSource, eventData));

// . . .
// Destroy by hand, and only once every observer is done with the bytes.
reinterpret_cast<std::shared_ptr<Weapon>*>(eventData.bytes)->~shared_ptr();
delete [] eventData.bytes;

Holy crap that's so ugly! I don't even know if it will work, and I don't want to find out! One obvious problem is that EventCallbacks may copy the event to be handled to another thread. Then when the thread unpacks the eventData to get at the shared_ptr… KABOOOM!
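To make the failure concrete, here is a small sketch. RawData is a renamed copy of the struct above and copy_shares_buffer is a made-up helper. Copying the struct copies only the pointer, so both copies refer to the same bytes and neither knows who is supposed to destroy whatever lives inside them.

```cpp
#include <cassert>

// Same layout as the hand-rolled container above (hypothetical name).
struct RawData
{
    unsigned char * bytes;
    unsigned int byteCount;
};

// Returns true when copying the struct merely aliases the payload.
bool copy_shares_buffer ()
{
    unsigned char buffer[16] = { 0 };
    RawData original = { buffer, sizeof(buffer) };
    RawData copy = original;  // compiler-generated copy: pointer and size only
    assert(copy.byteCount == original.byteCount);
    return copy.bytes == original.bytes;
}
```

If an observer keeps such a copy and the sender then runs the destructor and delete [] shown above, the observer is left holding freed memory.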

So after this long meandering preamble full of contrived situations, we get to the meat of the porkchop. How do we make a data type that holds heterogeneous data safely? Templates to the rescue!

First look at the code. Then marvel at the weirdness.

Code:
#include <new> // placement new

class DataWrapperBase
{
    unsigned char * _storage; // caller-owned bytes that will hold the value
protected:
    // Copy-constructs an A into the storage and returns a typed pointer to it.
    template<typename A>
    A * rawAssign (A const& a)
    {
        A * init = new (_storage) A(a);
        return init;
    }
public:
    DataWrapperBase (unsigned char * stor): _storage(stor) { }
    virtual ~DataWrapperBase ( ) { }
    // Builds a wrapper of the same dynamic type in wrapmem,
    // holding a copy of the wrapped value constructed in stor.
    virtual DataWrapperBase * clone (unsigned char * wrapmem, unsigned char * stor) = 0;
};

template <typename T>
class DataWrapper : public DataWrapperBase
{
    T * _ref; // the T living in the storage
public:
    DataWrapper (T const& t, unsigned char * stor):
        DataWrapperBase(stor), _ref(0) { _ref = this->template rawAssign<T>(t); }
    ~DataWrapper ( ) { _ref->~T(); }

    DataWrapperBase * clone (unsigned char * wrapmem, unsigned char * stor) {
        return new (wrapmem) DataWrapper<T>(*_ref, stor);
    }
};

You might think that it looks just as ugly as the code above it. Well, it's actually kind of beautiful. Notice the complete lack of casts. Here is how it works:

DataWrapperBase acts as an interface to the essential functions that provide type safety, namely destructors and copy constructors. DataWrapper acts as an implementation of the interface for each type it is instantiated for. So now we can redefine the Data type to hold arbitrary types, safely!
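The interface/implementation split is easier to see in isolation, so here is a stripped-down sketch that simply allocates on the heap. HolderBase, Holder and round_trip are illustrative names, not part of the code in this post. The caller only ever touches the base class, yet the right copy constructor and destructor run.

```cpp
#include <cassert>
#include <string>

// Interface: the operations every stored value must support.
class HolderBase
{
public:
    virtual ~HolderBase () { }
    virtual HolderBase * clone () const = 0;
};

// Implementation: the compiler stamps one of these out per T.
template <typename T>
class Holder : public HolderBase
{
public:
    explicit Holder (T const& v) : value(v) { }
    HolderBase * clone () const { return new Holder<T>(value); }
    T value;
};

// Copies and destroys a string through the base interface only.
std::string round_trip (std::string const& text)
{
    HolderBase * original = new Holder<std::string>(text);
    HolderBase * copy = original->clone();  // std::string's copy constructor runs
    assert(copy != original);
    delete original;                        // std::string's destructor runs

    // The copy owns its own string, so it survives the original.
    std::string result = static_cast<Holder<std::string>*>(copy)->value;
    delete copy;
    return result;
}
```

The Data type that follows applies the same idea, but constructs into memory supplied by the caller instead of calling new for every wrapper.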

Code:
struct Data
{
    template<typename T>
    Data (T const& t)
    {
        dataSize = sizeof(T);
        data = new unsigned char[dataSize];
        dataWrapper = new (wrapperMemory) DataWrapper<T>(t, data);
    }

    // . . .

    Data (Data const& other)
    {
        dataSize = other.dataSize; // the copy needs its own size as well
        data = new unsigned char[dataSize];
        dataWrapper = 0;           // stays null if there is nothing to clone
        if(other.dataWrapper)
            // If you follow clone you'll notice it ends up at the rawAssign
            // template function. In there the copy constructor is safely
            // called using the correct type.
            dataWrapper = other.dataWrapper->clone(wrapperMemory, data);
    }

    // . . .

    ~Data ( )
    {
        if(dataWrapper)
            // The destructor of DataWrapperBase is virtual, so it goes
            // to the DataWrapper<T> implementation and calls T's destructor.
            dataWrapper->~DataWrapperBase();

        delete [] data;
    }

    DataWrapperBase * dataWrapper;
    unsigned char * data;
    unsigned int dataSize;
    // Every DataWrapper<T> has the same members whatever T is, so one
    // instantiation supplies the size and alignment for all of them.
    alignas(DataWrapper<int>) unsigned char wrapperMemory[sizeof(DataWrapper<int>)];
};

template<typename T>
inline T const& get_data (Data const& dat)
{
    // Unchecked: the caller must ask for the same T that was stored.
    return *reinterpret_cast<T*>(dat.data);
}

In this monstrosity lies generic compile time programming, and run time type safe dynamic objects. Let's see how it works:

Data::Data<T> => DataWrapper<T>::DataWrapper<T> => rawAssign<T> => T's copy constructor.

Data::Data => DataWrapperBase::clone => DataWrapper<T>::clone => DataWrapper<T>::DataWrapper<T> => rawAssign<T> => T's copy constructor.

Data::~Data => DataWrapperBase::~DataWrapperBase => DataWrapper<T>::~DataWrapper<T> => T's destructor.

(Technically that's not how the compiler sees it, but that's beside the point).
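One way to convince yourself of those three chains is to store a probe type that counts its own copies and destructions. The following is a condensed sketch under that assumption: Counted, Wrapper, Box and trace_chains are illustrative names and a simplified stand-in for the classes above, not the code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Probe type that counts copy constructions and destructions.
struct Counted
{
    static int copies;
    static int destroyed;
    Counted () { }
    Counted (Counted const&) { ++copies; }
    ~Counted () { ++destroyed; }
};
int Counted::copies = 0;
int Counted::destroyed = 0;

class WrapperBase
{
public:
    virtual ~WrapperBase () { }
    virtual WrapperBase * clone (void * wrapmem, unsigned char * stor) const = 0;
};

template <typename T>
class Wrapper : public WrapperBase
{
    T * _ref;
public:
    Wrapper (T const& t, unsigned char * stor) : _ref(new (stor) T(t)) { }
    ~Wrapper () { _ref->~T(); }
    WrapperBase * clone (void * wrapmem, unsigned char * stor) const
    {
        return new (wrapmem) Wrapper<T>(*_ref, stor);
    }
};

struct Box
{
    template <typename T>
    Box (T const& t) : data(new unsigned char[sizeof(T)]), size(sizeof(T))
    {
        static_assert(sizeof(Wrapper<T>) == sizeof(Wrapper<int>), "wrapper sizes must match");
        wrapper = new (wrapperMemory) Wrapper<T>(t, data);
    }

    Box (Box const& other) : data(new unsigned char[other.size]), size(other.size)
    {
        wrapper = other.wrapper->clone(wrapperMemory, data);
    }

    ~Box () { wrapper->~WrapperBase(); delete [] data; }

    Box & operator= (Box const&) = delete; // not needed for the trace

    WrapperBase * wrapper;
    unsigned char * data;
    std::size_t size;
    alignas(Wrapper<int>) unsigned char wrapperMemory[sizeof(Wrapper<int>)];
};

// Returns true when the copy/destroy counts match the three chains.
bool trace_chains ()
{
    Counted::copies = Counted::destroyed = 0;
    {
        Counted probe;
        Box first(probe);    // chain 1: one copy of Counted
        Box second(first);   // chain 2: one more copy, reached through clone
        assert(Counted::copies == 2);
    }                        // chain 3 runs twice, then probe itself dies
    return Counted::copies == 2 && Counted::destroyed == 3;
}
```

Box never names Counted outside its template constructor, yet both counters come out right, which is the whole point of the wrapper.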

Code:
Data eventData(_enemy->weapon());

std::for_each(_observers.begin(), _observers.end(),
    std::bind(&EventCallback::operator(), _1, eventID, eventSource, eventData));

// In the event callback function . . .
std::shared_ptr<Weapon> weapon = get_data<std::shared_ptr<Weapon> >(eventData);

That is it! No fiddling with memory, no worries about copying or object lifetime. It's as easy as pie! Now the above object is actually a bizarre aborted beast that you shouldn't use because it's fugly, incomplete and probably has a bug somewhere in it. I provide a safe implementation below.

You might also have noticed the bizarre way that I handled the memory for DataWrapper<T>. Therein lie the seeds of an awesome optimization. If you change:

Code:
unsigned char * data;

to:

Code:
unsigned char data[dataSize]; // dataSize is a const.

You might notice that you don't need any more calls to new. Hmmm.. no dynamic memory allocation, no system calls, no loops, nothing undefined. Sounds pretty sweet. In fact the above technique was used as a packaging system for a high performance realtime software synthesizer, because it is entirely non-blocking and deterministic. It was used to pass automation data to and from a realtime thread. Anyone who has worked with realtime requirements knows how crazy strict you have to be.
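As a rough sketch of the storage half of that idea (InlineSlot, Point and lives_in_slot are made-up names, and the wrapper that remembers the type is left out), a fixed-size buffer plus placement new puts the object inside the container's own bytes. The static_asserts and the alignas are my additions and assume a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <std::size_t Capacity>
class InlineSlot
{
    // Aligned for any ordinary type; over-aligned types are rejected below.
    alignas(std::max_align_t) unsigned char _bytes[Capacity];
public:
    template <typename T>
    T * construct (T const& value)
    {
        static_assert(sizeof(T) <= Capacity, "T does not fit in the slot");
        static_assert(alignof(T) <= alignof(std::max_align_t), "T is over-aligned for the slot");
        return new (_bytes) T(value);  // placement new: no heap allocation
    }
    unsigned char const * bytes () const { return _bytes; }
};

struct Point { float x, y; };

// True when the stored object lives inside the slot's own bytes.
bool lives_in_slot ()
{
    InlineSlot<64> slot;
    Point p = { 1.0f, 2.0f };
    Point * stored = slot.construct(p);
    bool inside = reinterpret_cast<unsigned char const *>(stored) == slot.bytes();
    bool same = stored->x == 1.0f && stored->y == 2.0f;
    stored->~Point();  // manual destruction pairs with placement new
    return inside && same;
}
```

Because the capacity is a compile-time constant, a type that is too big fails to compile instead of overflowing the buffer at run time.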

Here is a reasonably refined and tested version of the above, ready to use and in a single header file using only standard C++. It's called Generic. You can use it like so:

Code:
typedef ark::Generic<64> Message; // Message is 64 bytes in size.
Message msg;

msg = somePointerValue;
msg = someReferenceValue;
msg = string("this is OK");
msg = "sorry, this isn't ok"; // FAIL
msg = (void*)"I guess you can do this with static strings.";
msg = 1337;

std::ostringstream ost;
ost << msg;

msg = 0;

std::istringstream ist(ost.str());
ist >> msg;

std::cout << ark::Get<int>(msg) << std::endl; // prints 1337

Don't use streams with non-plain old data, unless you know what you're doing. And I am sure that there is some combination of templates and references that will throw up an indecipherable compiler error message. But it works very well for the most common cases.

Now going back to our original, contrived problem:

Code:
typedef ark::Generic<128> EventData;
typedef std::function<void (EventID const&, Enemy*, EventData const&)> EventCallback;

void EnemyEventOccured (EventID const& name, Enemy * enemy, EventData const& data)
{
    if(name == EnemyEvent::didMove)
    {
        const Point currentLocation = enemy->location();
        const Point previousLocation = ark::Get<Point>(data);
        // . . . do something!
    }
    else if(name == EnemyEvent::didAttack)
    {
        std::shared_ptr<Object> target = ark::Get<std::shared_ptr<Object> >(data);
        // . . . do something!
    }
    else // . . .
}

Tada!

As an aside, I 'invented' this technique independently when I was 19 and was super proud of myself. Then I discovered it has a name and was used by others (although not with the static memory trick, as far as I know). It's usually called 'type erasure', not 'the curiously recurring template pattern' (that name belongs to a different trick, where a class derives from a template instantiated on itself), and it is the basis for much template magic. It's used all throughout the boost libraries. Speaking of boost, if you are using the boost libraries in your project then PLEASE, PLEASE, for the love of [insert deity/swearword] use boost::any instead of ark::Generic. This code has worked great for me, and is running smoothly in the wild, but boost's libraries are vetted by the best C++ programmers in the world. You should trust them more than me. That being said, there are many valid reasons to not include the boost libraries, and the Generic code has NO dependencies outside the standard C++ libraries. It's also simple enough to wrap your head around and learn something.
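For reference, the pattern usually called the curiously recurring template pattern has a class derive from a template instantiated on itself. That is a different shape from the base-interface-plus-template-implementation used for Data above, which is generally called type erasure. A minimal sketch, with Shape, Square and doubled_area as made-up names:

```cpp
#include <cassert>

// CRTP: the base class template is parameterised on the class deriving from it.
template <typename Derived>
class Shape
{
public:
    // Resolved at compile time; there is no virtual call.
    double area () const
    {
        return static_cast<Derived const *>(this)->computeArea();
    }
};

class Square : public Shape<Square>
{
    double _side;
public:
    explicit Square (double side) : _side(side) { assert(side >= 0.0); }
    double computeArea () const { return _side * _side; }
};

// Works for any Shape<D> without a common runtime base class.
template <typename D>
double doubled_area (Shape<D> const& shape)
{
    return 2.0 * shape.area();
}
```

Here the type is known everywhere at compile time, whereas Data deliberately forgets it and recovers the right behaviour through a virtual call.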

---

Let me end with an inappropriate rant of warning which applies to both new and experienced developers. If you have any choice in the matter, don't use C or C++. I know that many of you have an emotional or professional attachment to using the C family of languages, but hear me out.

Writing good reliable software in C++ is error prone. Errors that do occur are often obscure and difficult to debug. Writing good software in C++ often requires the use of libraries that are tied to a platform or specific vendor. Using a dynamically typed language is usually much easier and more pleasant than using a statically typed language like C++. Finally, C/C++ once had the distinction of running almost everywhere. That era is coming to a close. On locked down mobile platforms where the vendor refuses to release tools, or on the Web, C/C++ is just not welcome anymore.

Now what are some of the valid reasons to use C++? They fall into a few distinct categories:
1) Leveraging existing libraries that are unavailable in another language.
2) Speed, speed, speed. Or, deterministic execution time (realtime).
3) Learning is fun.

Just to set the record straight, I have been developing in C++ for over 14 years and I love it. I have learned when it is appropriate and when it is overkill. No hating here, just a respect for awesome power.

Good luck!
« Last Edit: March 02, 2011, 11:39:07 PM by eclectocrat »

I make Mysterious Castle, a Tactics-Roguelike
Will Vale
« Reply #1 on: March 05, 2011, 02:05:21 PM »

Sounds like a variant, or indeed boost::any as you point out. Watch out for alignment requirements - you need your untyped buffer to have alignof(T) as well as sizeof(T) to put a T in there.

Do you retain event data? If so I can see why you'd want to package it up, but if not can't you just pass a pointer to whatever-it-is and cast it? It's not like you've bound up the event ID to the type-safe container, so you're still doing the switch or if-else chain to determine the type.

Will
lapsus
« Reply #2 on: March 05, 2011, 03:01:36 PM »

removed my rant since it as usual makes me sound like a bit of an asshole :) i'll save it for next time i need to nerd rage over what i think is bad c++

i do feel that the use case you solve by the code above can usually be made smaller and solved by something less generic and that avoids the whole template-bloat you end up with

and neat code, not complaining about that
« Last Edit: March 05, 2011, 03:29:41 PM by lapsus »

eclectocrat
« Reply #3 on: March 05, 2011, 09:44:21 PM »

Sounds like a variant, or indeed boost::any as you point out. Watch out for alignment requirements - you need your untyped buffer to have alignof(T) as well as sizeof(T) to put a T in there.

Do you retain event data? If so I can see why you'd want to package it up, but if not can't you just pass a pointer to whatever-it-is and cast it? It's not like you've bound up the event ID to the type-safe container, so you're still doing the switch or if-else chain to determine the type.

True dat, just trying to keep it standardese. GCC does a pretty good job of handling alignment, and when dealing with 2/4/8 byte alignments it's hard to mess up unless you try.

Well, I like to make no assumptions of how clients will handle data, so I give them the option of retaining. The original context was in a realtime audio processor, where an observer received an event and then either handled it or pushed it onto a queue for handling by another thread. It's really about passing shared references and knowing that they will be cleaned up properly.

I freely admit the whole thing was a public C++ wank, but I remember when I was just a lad and hungry for those post intermediate techniques. Like I said above, I heartily recommend using boost or some other vetted library, but if you want to learn something neat, and maybe stash another technique in your toolbox, have fun!

removed my rant since it as usual makes me sound like a bit of an asshole :) i'll save it for next time i need to nerd rage over what i think is bad c++

i do feel that the use case you solve by the code above can usually be made smaller and solved by something less generic and that avoids the whole template-bloat you end up with

and neat code, not complaining about that

Dude, totally. This particular technique was so useful when passing heterogeneous data over non-blocking FIFOs to and from realtime contexts. The lack of blocking operations is a critical feature. I can't think of another place where I'd choose something so obscure. Just having some fun with it!
Will Vale
« Reply #4 on: March 06, 2011, 12:33:19 AM »

Your other use case sounds like a good fit for this, I agree.

On alignment, I think saying "I want to keep it standard C++" and "well, it works on GCC" are somewhat at odds with one another :p

Probably worth checking it works with something with weird alignment - putting a qword or __m128 in this kind of container seems like a reasonable thing to do. Also check it still works when the container's inside another structure and not at the start :)

Will
eclectocrat
« Reply #5 on: March 06, 2011, 04:29:10 AM »

Your other use case sounds like a good fit for this, I agree.

On alignment, I think saying "I want to keep it standard C++" and "well, it works on GCC" are somewhat at odds with one another :p

Probably worth checking it works with something with weird alignment - putting a qword or __m128 in this kind of container seems like a reasonable thing to do. Also check it still works when the container's inside another structure and not at the start :)

You mean GCC isn't standard? ;)

But seriously, until the new C++0x standard brings in alignof/alignment_of, the only guarantee on alignment is that new returns memory suitably aligned for any ordinary object type, which says nothing about over-aligned types. So all the standard containers like map or list shouldn't be used for SSE types or other esoteric data. Even Intel's own compiler had (has?) bugs regarding operator new alignment. The object should fare no worse than most wrappers/containers on alignment issues.

In the code I linked to, the class has a template argument to determine its total size in bytes. I guess a masochist could make a 39 byte sized object, but I just assumed that someone would choose something reasonable, like a power of 2. In the use case I mentioned, a whole buffer of objects was allocated at once, and constructed in place:

const size_t object_size = 64;
const size_t object_count = 32;
unsigned char * mem = allocate(object_count * object_size);
for(size_t i = 0; i != object_count; ++i)
{
    Generic<object_size> * created = new (mem + object_size*i) Generic<object_size>();
}

It's good you mention alignment, because it's a real issue that people ought to become familiar with if they're going to push their code to its fastest, but keep in mind that the code I gave will have precisely the same alignment issues that a std::pair<T, void*> does, and most of the time you just don't need to care.
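One way to check the concern about a buffer that sits inside another structure is to look at member offsets directly. This is a sketch with made-up names (is_aligned_for, Unaligned, Aligned) and it assumes a compiler with C++11 alignas/alignof:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when p satisfies T's alignment requirement.
template <typename T>
bool is_aligned_for (void const * p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// A byte buffer placed after a char lands on an odd offset.
struct Unaligned { char tag; unsigned char buffer[16]; };

// alignas pushes the buffer to an offset suitable for double.
struct Aligned { char tag; alignas(double) unsigned char buffer[16]; };

std::size_t unaligned_offset () { return offsetof(Unaligned, buffer); }
std::size_t aligned_offset ()   { return offsetof(Aligned, buffer); }

// True when a buffer inside a stack-allocated Aligned can hold a double.
bool aligned_buffer_fits_double ()
{
    Aligned a;
    a.tag = 'x';
    return is_aligned_for<double>(a.buffer);
}
```

A plain unsigned char array only promises alignment 1, so whether it happens to be good enough for a given T depends on what precedes it in the enclosing object.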

There are definitely corner cases that I haven't covered, and you can find ways to break it, but like I mentioned before, it's more of an exhibit in template tricks than something you should plop into your codebase without thinking. Every code has its edges and I hope I have given sufficient disclaimer. Remember the double checked locking pattern? That was being recommended for years by really competent programmers before a bug was discovered in it.

Cheers!
raisedbyfinches
« Reply #6 on: March 18, 2011, 06:38:18 PM »

I think by 'standard' you mean ANSI. Which means compiling using GCC does not mean it is. GCC allows so much random crap by guessing what it THINKS you want.

Deaf man in a city of sirens
_Tommo_
« Reply #7 on: March 19, 2011, 06:47:47 PM »

I just define a special pure virtual function for each event that I need to support somewhere, and tada! I have all the types I need.

I'm really wary of these things of meta-magic when coding something as simple as a game's event system, anyway...
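A minimal sketch of that listener-per-event approach, with made-up names (EnemyListener, MoveRecorder). It is one reading of the description above, not _Tommo_'s actual code. Each event gets its own pure virtual function, so the payload types are spelled out in the signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { int x, y; };
class Enemy;

// One pure virtual function per event; the arguments are the payload.
class EnemyListener
{
public:
    virtual ~EnemyListener () { }
    virtual void enemyDidMove (Enemy & enemy, Point from, Point to) = 0;
    virtual void enemyDidDie (Enemy & enemy) = 0;
};

class Enemy
{
    std::vector<EnemyListener*> _listeners;
    Point _location;
public:
    Enemy () : _location() { }
    void addListener (EnemyListener * listener) { _listeners.push_back(listener); }
    void moveTo (Point to)
    {
        Point from = _location;
        _location = to;
        for (std::size_t i = 0; i != _listeners.size(); ++i)
            _listeners[i]->enemyDidMove(*this, from, to);
    }
};

// Example listener that remembers where the last move started.
struct MoveRecorder : EnemyListener
{
    int moves;
    Point lastFrom;
    MoveRecorder () : moves(0), lastFrom() { }
    void enemyDidMove (Enemy &, Point from, Point) { ++moves; lastFrom = from; }
    void enemyDidDie (Enemy &) { }
};
```

The trade-off is that every new event touches the interface and every listener that implements it, which is the price paid for having no casts at all.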

eclectocrat
« Reply #8 on: March 20, 2011, 02:50:17 AM »

I think by 'standard' you mean ANSI. Which means compiling using GCC does not mean it is. GCC allows so much random crap by guessing what it THINKS you want.

I was being sarcastic! But ultimately there is no C++ without a compiler, and every compiler tries its best to generate good code. Your best bet is writing standard ANSI/ISO C++ and letting the compiler worry about the machine.

I just define a special pure virtual function for each event that I need to support somewhere, and tada! I have all the types I need.
I'm really wary of these things of meta-magic when coding something as simple as a game's event system, anyway...

Sure, there are many good ways to manage events, but the point of this article is to explain the magic, and show how it's really very well defined behaviour, and not all that complex when you wrap your head around the details.

It's important to use techniques that you understand fully, but it's also good to learn new techniques, so that you're better prepared for future programming challenges.
_Tommo_
« Reply #9 on: March 20, 2011, 03:32:23 AM »

Sure, there are many good ways to manage events, but the point of this article is to explain the magic, and show how it's really very well defined behaviour, and not all that complex when you wrap your head around the details.

It's important to use techniques that you understand fully, but it's also good to learn new techniques, so that you're better prepared for future programming challenges.

In fact, I'm now using this method because I first tried using "magic", and it didn't work well :D
C++ has limits that you WILL hit if you try to use it this way, and I would rather make a game than figure out why a template is instantiated in my DLL but does not exist in the exe...
Anyway that's just IMO, I just wanted to point out that despite appearances, simple is better :)

eclectocrat
« Reply #10 on: March 20, 2011, 06:16:52 AM »

Every language has limits that you WILL hit if you try to use it in some way...

Fixed that for you. It's totally true, but my point is that it's not magic. It's really well defined behaviour that is used in many reliable programs. Once you get over the hump of how something works and how to use it, it ceases to be magical. The first time I used 'make' it puked up object files all over my computer, but that's not a valid reason to never use it. If an IDE suits you better, then by all means stick with the best solution for you, but if a challenge presents itself that is most efficiently solved by using 'make', then it's inefficient to find a convoluted solution using your preferred tool.

Anyway that's just IMO, I just wanted to point out that despite appearances, simple is better :)
Totally true.
"Make everything as simple as possible, but not simpler" A. Einstein
robert05041
« Reply #11 on: April 01, 2011, 11:49:48 PM »

I would like to thank you all for this discussion... this was helpful for me..