Author Topic: General programming discussion  (Read 29239 times)
JWki
« Reply #160 on: October 18, 2017, 05:00:20 AM »

I will check out OpenGEX. But for binary, I guess the only way is to stick with FBX.

Out of curiosity, is there a reason you require your intermediate format to be binary?
Garthy
« Reply #161 on: October 23, 2017, 02:01:31 PM »

Just wanted to share this blog post because I thought it was very interesting:

http://ourmachinery.com/post/multi-threading-the-truth/

I don't think this was a particularly good article.

What follows might come off as brutally critical, and I want to establish that it is not my intention to be cruel. It is my intention, however, to highlight numerous significant issues with the article. My apologies to the author, if they are reading, for any offense caused. Please do not take the following personally.

The article manages to get on my nerves within the first two sentences:

"I found that there is relative little information out there on how to multi-thread actual, real-world systems"

I don't think this is even remotely true.

The deceptive title doesn't help. It seems like "The Truth" was chosen simply to make the title sound more impressive, as if the article would reveal the greater truths behind multithreading.

Anyway, the article continues, delving into ways to work around the perceived problems of multithreading by bypassing the facilities available (e.g. going right to compare-and-swap), without showing much understanding of what is available.

There is talk about using spinlocks. There are two main uses for spinlocks: (i) when you are trying to get the basics going and don't care about performance; and (ii) when you are getting into the really hairy time-critical stuff and you are willing to starve everything else running on the machine.

"With a spinlock, our CPU will churn waiting for the busy thread. This consumes power and creates heat."

I barely know where to start with this one. The scheduler is going to switch out the ill-behaved process. The problem is starving everything else out. And that is going to create problems, some of which will be managed, and *maybe* at the end of the process we will use power and generate heat. But a well-behaved system that is efficient and busy is going to generate heat and consume power as well. And many systems will manage this heat by slowing the processor. It is not a matter of power consumption and heat, but of efficient use of resources.

If you're ever forced into something like a spinlock, and there are usually mechanisms to avoid doing so, it should generally work like this:

- Check everything
- Does anything need updating?
  - If so: Update it.
- Yield.

That last step is important. If your thread can't do anything useful right now, give it up to one that can.
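To make the loop above concrete, here is a minimal C++ sketch (all names are hypothetical, not from the article): check for work, do it if there is any, and otherwise hand the core back to the scheduler.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> work_ready{false};
std::atomic<int>  result{0};

// Hypothetical worker: poll, but yield whenever there is nothing to do,
// so the scheduler can hand the core to a thread that can make progress.
void poll_for_work()
{
    for (;;) {
        if (work_ready.exchange(false)) {  // does anything need updating?
            result.fetch_add(1);           // if so: update it
            return;
        }
        std::this_thread::yield();         // yield: give the slice up
    }
}
```

The `yield()` call is the whole point: without it this loop burns its entire timeslice doing nothing useful.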

There is also a distinct lack of actual measurement, with speculation in the place of actual numbers. This covers it well:

"I have no good numbers for this and it would require a thorough investigation."

Or, instead of a thorough investigation, some basic testing of the concepts, information on the test setup, and the results.

And:

"But if you look at the cycle counts in the case of no contention they are all pretty similar. In fact, they are all pretty fast."

This is vague. Where are the numbers?

"A common multithreaded programming model is to create a number of jobs, go wide to execute them, and then go single-threaded again to collect the results."

Whilst this is one way multithreading is used, there are plenty of other ways it is used. Sometimes you have tasks that run near continuously that kick in with their contribution when they need to do something. GUI and audio management are two popular examples. Many times they sit in their own thread(s) and behave as ideal threads: sleeping and yielding most of the time, and grabbing a chunk of CPU time when they need it. This is a reasonable statement on its own, but it then becomes the focus of what follows, to the exclusion of all else.

There is talk of scheduling kicking in at the wrong time. Scheduling has to kick in eventually, as your thread and process aren't the only thing running on the machine. You can't look at a single thread in isolation and take steps to optimise its performance and dodge the scheduler because the time you take in that thread is time not spent in other threads. An analogy: Take team sports. You shouldn't spend so much time fussing over optimal performance of a single player at the cost of the others when the best overall course of action may be to pass the damn ball. You concentrate on making each player work well to ensure the best team outcome.

And on the subject of scheduling kicking in and spinlocking, the best thing you can do is to work *with* the scheduler, not against it. One way to do this is with condition variables. For those unfamiliar: These are basically a two-sided mechanism. One side is basically a way of a thread informing the scheduling system that they're stuck, but they're waiting on something, and pretty-please give us control again when the resource we have requested is ready. The other side is basically a way of saying that a thread has just provided information that another thread probably needs. They're a way to let the scheduler know what you want to happen, and how it needs to happen. It is not uncommon for a scheduler to pass control right back to the original thread that was waiting, especially since the scheduler will be aware that the thread yielded early, well before its slice was over. Some schedulers prioritise on how little time a thread has used. Yield early and yield often.
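For anyone who'd like to see the two sides of that mechanism concretely, here is a minimal producer/consumer sketch using `std::condition_variable` (all names hypothetical):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::mutex              m;
std::condition_variable cv;
std::queue<int>         items;

// Consumer side: tell the scheduler we are blocked until data arrives.
// The thread sleeps here -- no spinning -- until the predicate holds.
int take()
{
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return !items.empty(); });
    int v = items.front();
    items.pop();
    return v;
}

// Producer side: publish data, then tell the scheduler a waiter can run.
void put(int v)
{
    {
        std::lock_guard<std::mutex> lock(m);
        items.push(v);
    }
    cv.notify_one();
}
```

The waiting thread yields its slice immediately, and the notify gives the scheduler exactly the hint described above: this thread now has what it was waiting for.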

Now search the article for the term "condition variable", this very important tool for assisting a scheduler.

The system spoken about seems to be a mish-mash of a transaction-based system and a routing system. It is a bit vague, but that is the impression I got.

Let's look at the transaction side. A solution should be designed accordingly. Coarse-grained parallelism is generally preferred over fine-grained. So: One thread builds up the set of changes, passes ownership safely to another thread that manages them (think mutex on the pointer to data), and the result is then reported. The original thread can yield or go off and do other stuff, checking the result later.
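A minimal sketch of that handoff, assuming a hypothetical ChangeSet type: the builder thread fills the set privately, and only the pointer transfer is guarded, so the lock is held for a moment regardless of how large the batch is.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical change set: one thread fills it up, then hands the whole
// thing over at once -- coarse-grained, one lock per batch.
struct ChangeSet { std::vector<std::string> edits; };

std::mutex                 handoff_mutex;
std::unique_ptr<ChangeSet> pending;  // owned by whoever holds the pointer

void submit(std::unique_ptr<ChangeSet> changes)
{
    std::lock_guard<std::mutex> lock(handoff_mutex);
    pending = std::move(changes);  // only the pointer swap is guarded
}

std::unique_ptr<ChangeSet> collect()
{
    std::lock_guard<std::mutex> lock(handoff_mutex);
    return std::move(pending);  // null if nothing was submitted
}
```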

Now for routing: If the system is genuinely shuffling a lot of data (the article is again vague on specifics), then such a system should facilitate setting up some sort of stream between the threads, managing only the coordination issues and contention with other threads wanting access to the same data. Perhaps the system should accept large chunks of data, which the other thread can request a pointer to.

Anyway, the specifics of the actual goals and performance parameters are left vague, but this is where I'd be starting in general.

The article does mention one useful concept which is worth noting: Bulk allocation of resources (eg. memory) that are stored per-thread and can then be allocated to that thread, allowing that resource to be divvied up without locking. This is a good idea, but it's also an old idea. Search for: "glibc malloc arena" for a more advanced example.
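As a rough illustration of the idea (my own sketch, not the article's implementation, and far simpler than glibc's arenas), a per-thread bump allocator grabs one big block up front and then hands out pieces with plain pointer arithmetic, no lock needed:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-thread arena: allocate one large block once, then
// divvy it up with pointer bumps. No locking, because each thread only
// ever touches its own arena.
class ThreadArena {
public:
    explicit ThreadArena(std::size_t bytes) : storage_(bytes), used_(0) {}

    void* allocate(std::size_t bytes)
    {
        if (used_ + bytes > storage_.size()) return nullptr;  // arena full
        void* p = storage_.data() + used_;
        used_ += bytes;
        return p;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<std::byte> storage_;
    std::size_t used_;
};

// One arena per thread; thread_local keeps allocations lock-free.
thread_local ThreadArena tls_arena(1 << 16);
```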

As with all things performance-related, measurement is important. "Measure" and "measurement" are two other words to search for in the article.

There's so much more, but that's enough for now.

Some palate cleansers:

https://en.wikipedia.org/wiki/Monitor_(synchronization)
https://en.wikipedia.org/wiki/Thread_(computing)#Multithreading
https://en.wikipedia.org/wiki/Parallel_computing
https://en.wikipedia.org/wiki/Scheduling_(computing)
https://en.wikipedia.org/wiki/Granularity_(parallel_computing)

Oh, and to be clear, I don't want to discourage sharing of articles, even ones like this one. My input is specifically on the article itself, not that it was shared in the first place.
« Last Edit: October 23, 2017, 02:19:48 PM by Garthy »
JWki
« Reply #162 on: October 23, 2017, 05:04:30 PM »

Tbqh I can't comment on that too much, because I was mostly interested in what the system they want to make threadsafe does (it's called The Truth, hence the title of the blog post), and I don't have too much in-depth knowledge about advanced multithreading.

Knowing what projects the author was involved in, though, I'd assume they are competent in their field, so I'd be interested in what they think about your remarks. Unfortunately the blog doesn't seem to allow comments.
ferreiradaselva
« Reply #163 on: October 23, 2017, 05:18:38 PM »

There's always Twitter: https://twitter.com/ourmachinery/status/919975159880019969

I agree with Garthy's criticism. I was also expecting some deep knowledge from the article, but I found none.

Something not mentioned was the heavy use of object manipulation by IDs. Months ago, that was something I would have liked, but I'm not fond of it anymore. It has its applications; the best I can think of is integration with a built-in editor, which must be crash-proof even if the user does something stupid. But I don't like it so much in a game, because it's indirect, and I'd rather the game be crash-able so I can detect errors immediately. That is some defensive programming that potentially hides problems. LET IT CRASH.
Garthy
« Reply #164 on: October 23, 2017, 08:21:28 PM »


Tbqh I can't comment on that too much, because I was mostly interested in what the system they want to make threadsafe does (it's called The Truth, hence the title of the blog post), and I don't have too much in-depth knowledge about advanced multithreading.

Just re-emphasising that my comments above are on the article itself, and in no way reflect on posting of the article. Personally, I enjoy links to interesting articles, which is why I followed the link in the first place.

Regarding the name "The Truth", I skimmed some other articles from that site and didn't find much else on "The Truth" as a system. If it is something the author has spoken about before, my impression of sensationalism in the title would be flawed.

Unfortunately the blog doesn't seem to allow comments.

Posting something like I have written here directly into his blog (or indirectly linking to it) would be a little cruel. It may put the poor guy in a position where he is effectively defending his credibility, on the site he publishes his writings, from a perceived attack. So it's not something I'll be doing personally. If someone else wants to, I guess that's their prerogative, but I wouldn't encourage it, and would ask that anyone who did so be respectful of the author as a person. I disliked the article, but I don't know the person.

If the author becomes aware of what I've written about his article, he can read it. If he wants to respond, he can do so. If he wants to ignore it, he can.

Something not mentioned was the heavy use of object manipulation by IDs. Months ago, that was something I would have liked, but I'm not fond of it anymore. It has its applications.

Different systems can have substantially different value depending on the problem you are trying to solve. I'd suggest keeping it in mind like you'd keep an extra tool in your toolbox. Sure, you might never use a corkscrew, but one day you might need to extract a cork from a bottle and your trusty hammer might not be the best tool to use. Even the truly silly ones have their uses sometimes.

For example: I recently implemented a system that effectively passes around integer parameters referenced by string IDs as strings, which involves looking up a string ID on each request, converting the integer result to a string to pass it along, and converting it back to an integer again on the other side. Such a thing would normally be ludicrously inefficient and worthy of some heavy mocking. In fact, part of the reason for this post is to make fun of myself given how ridiculous it sounds. But strangely enough, it actually ended up being an unexpectedly elegant solution to the problem I was trying to solve.
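For the amusement value, a stripped-down sketch of roughly that scheme (all names hypothetical, and deliberately as roundabout as described):

```cpp
#include <map>
#include <string>

// Hypothetical parameter table: integers keyed by string IDs.
std::map<std::string, int> params = {{"volume", 7}, {"retries", 3}};

// Sender side: look up the integer by string ID, then serialise it
// as a string to pass along.
std::string send(const std::string& id)
{
    return std::to_string(params.at(id));
}

// Receiver side: parse the string payload back into an integer.
int receive(const std::string& payload)
{
    return std::stoi(payload);
}
```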

JWki
« Reply #165 on: October 23, 2017, 11:37:00 PM »


Regarding the name "The Truth", I skimmed some other articles from that site and didn't find much else on "The Truth" as a system. If it is something the author has spoken about before, my impression of sensationalism in the title would be flawed.


It's the first time they mention it, but I thought the second paragraph was pretty clear that it's the name of the system - it implements their high-level object system (similar to what you have in UE4, but not as a horrible macro/codegen mess on top of C++'s object model) that other systems use to communicate. The name, I suppose, stems from the fact that this system effectively manages the application state, which you could call the "truth" of the world state, I guess.

Anyhow, I'm aware you weren't criticizing me, and I'm fine with what you posted - discussion is progress and all. However, I think it's possible that Niklas (the author) was just a bit handweavy with the details here, so as not to confuse a reader at my level of experience in multithreading, for example, and maybe some stuff got lost in translation, so to speak. Again, my default stance is to assume something like that, especially with the author in question having a long history in low-level coding at a AAA scale (primary author on the systems side of Stingray, formerly the Bitsquid engine, etc.).
Garthy
« Reply #166 on: October 24, 2017, 01:19:32 AM »



but I thought the second paragraph was pretty clear that it's the name of the system

Yes, that was my understanding too. The fifth paragraph in my original comment expands on my thoughts on that. But who knows? We are both speculating. Let's go with your interpretation for now.

just a bit handweavy with the details here to not confuse a reader on my level of exp in multithreading for example, and maybe some stuff got lost in translation so to speak.

Personally, I was not left with that impression. It felt to me like the author was stretching into unfamiliar territory, but trying to present it as a confident understanding. The former is something that should be strongly encouraged. The latter opens you up to criticism from people familiar with the subject matter.

And I'm stealing your typo of "handweavy" as a way to describe this latter concept. The (unintentional?) pun is irresistible.

especially with the author in question having a long history in low level coding on a AAA scale (primary author on the systems side of Stingray, formerly Bitsquid engine etc).

Sounds like he may be a very clever individual.
oahda
« Reply #167 on: November 11, 2017, 02:10:36 PM »

I keep seeing this model in graphics tests, engines and so on, but where does it come from originally and how did it become a standard test model?

ferreiradaselva
« Reply #168 on: November 11, 2017, 02:24:55 PM »

The first time I saw it was in one of the Unreal new release videos.
powly
« Reply #169 on: November 11, 2017, 05:19:25 PM »

Good question, there are a lot of variants of the model and many renderers use one as the standard test object but apparently there's no reference to where it originated.
JWki
« Reply #170 on: November 12, 2017, 02:53:54 AM »

I think the Mitsuba renderer had it first?
Ordnas
« Reply #171 on: November 13, 2017, 04:18:17 AM »

I also saw it first in Unreal Engine, but it could just be a coincidence.
oahda
« Reply #172 on: November 19, 2017, 02:58:53 AM »

So I've been reading up a bunch on entity design with regard to cache hits and so on. Let me just see if I've gotten the right ideas here…



1) According to my understanding, we use the cache optimally if we have everything sequentially ordered and iterated over in order.

2) For a small game without complex composition it might be better to just have all entities be the same even if this means storing unnecessary data for some of them, right?

3) For more complex composition I'm getting the impression that even if we store the different component types in separate arrays it's better to make sure that the components that belong to the same entity always appear in the same index, even if this means having unused components at certain indices that belong to entities that do not use that component type?

4) It makes sense to avoid doing this for every component type and only doing it for the critical ones that need to get iterated over all the time (like components used for rendering or physics) so that we can cut down on unused slots for less common components that aren't used in any critical system like that?

5) In general I shouldn't worry about wasting a little bit of RAM if it can boost the cache hits? And I should be using these arrays as pools anyway so that I can make use of those unused indices whenever possible instead of growing the arrays anyway, I'm guessing.



So with that in mind, would a design roughly like this make sense?

Code:
using Entity = std::uint16_t;

struct ComponentsCritical
{
    std::vector<Transform>   transforms;
    std::vector<Renderdata>  renderdata;
    std::vector<Animator>    animators;
    std::vector<Physicsdata> physicsdata;
    // and so on...
};

struct ComponentsNoncritical
{
    std::map<Entity, Audiosource>   audiosources;
    std::map<Entity, Audiolistener> audiolisteners;
    // and so on...
};

Might be better to use different containers and so on, but is the general concept sound? If it is, I guess there is one question remaining, and that's whether I should try to avoid branching when figuring out which critical components are actually in use for each entity, and if so, how.
qMopey
« Reply #173 on: November 19, 2017, 08:35:18 AM »

That design is perfectly acceptable.
JWki
« Reply #174 on: November 19, 2017, 10:10:33 AM »

Want to avoid branching -> sort.
oahda
« Reply #175 on: November 19, 2017, 10:13:44 AM »

Yeah, thought so too. Sorting and grouping. But if I sort, I can't just use the index in the array as the entity ID, so what's the best structure? Are key-value pairs still fine then, so long as I'm iterating over them in order?
powly
« Reply #176 on: November 19, 2017, 11:07:44 AM »

Yes, pairs are essentially just concatenated in memory so from the coherency point of view it's fine. But you'll probably also require access to the component based on entity ID so you'll need to store the reverse map; from entity ID to component ID.
qMopey
« Reply #177 on: November 19, 2017, 12:02:59 PM »

More importantly packing arrays contiguously results in simpler code when iterating. Sparse arrays breed more complexity. It’s totally fine to use another layer of indirection to map ids to indices.
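A minimal sketch of that extra layer of indirection (names hypothetical): a dense component array for iteration, an entity-to-index map for lookup, and swap-and-pop removal so the array stays packed.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using Entity = std::uint16_t;

// Hypothetical packed component store: components stay contiguous for
// cache-friendly iteration; the map adds the id -> index indirection.
template <typename Component>
class PackedStore {
public:
    void add(Entity e, const Component& c)
    {
        index_of_[e] = components_.size();
        components_.push_back(c);
        owner_.push_back(e);
    }

    // Remove by swapping the last element into the hole: the array stays
    // dense, and only one map entry needs patching.
    void remove(Entity e)
    {
        std::size_t i = index_of_.at(e);
        index_of_[owner_.back()] = i;
        components_[i] = components_.back();
        owner_[i] = owner_.back();
        components_.pop_back();
        owner_.pop_back();
        index_of_.erase(e);
    }

    Component& get(Entity e) { return components_[index_of_.at(e)]; }
    std::vector<Component>& all() { return components_; }  // iterate this

private:
    std::vector<Component>  components_;  // dense, cache-friendly
    std::vector<Entity>     owner_;       // index -> entity
    std::unordered_map<Entity, std::size_t> index_of_;  // entity -> index
};
```

Systems iterate `all()` with no holes and no branching; random access by entity pays one hash lookup, which is usually the rarer operation.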
oahda
View Profile
« Reply #178 on: November 19, 2017, 12:30:19 PM »

I wonder, is there any difference cache-wise between this:

Code:
struct Entity
{
    Transform   transform;
    Renderdata  renderdata;
    Animator    animator;
    Physicsdata physicsdata;
};

std::vector<std::pair<IDEntity, Entity>> components;

and this?

Code:
std::vector<std::pair<IDEntity, Transform>> transforms;
std::vector<std::pair<IDEntity, Renderdata>> renderdata;
std::vector<std::pair<IDEntity, Animator>> animators;
std::vector<std::pair<IDEntity, Physicsdata>> physicsdata;

Is one better than the other and if so why? I'm finding the former opens up for a lot of nice opportunities design-wise. Since I'm doing the thing where I'm storing critical components even for entities that don't use them in this case, the first solution makes sense too, as even in the second version, there would always be an allocation for every component type with the same entity ID, so the amount of data/memory in use is the same in each case…

It's just the data layout I'm wondering about. In the former absolutely everything is contiguous but components of the same type aren't right next to each other, while in the latter it's split up into one array for each type but components of the same type right next to each other. Is the former a problem or does it still work cache-wise? Any input?
JWki
« Reply #179 on: November 19, 2017, 12:37:30 PM »

Rule of thumb: keep data that is processed in batches next to each other in memory. If you are processing each component type as a batch, you should store them in a batch.