The first and most important thing you need to know is that data access times vary. For those unaware, "memory" refers to a computer’s main memory, or RAM (Random Access Memory). What you most likely think of as memory is actually your hard disk drive (HDD); this is a fairly common misconception. For the computer to make use of data, the data has to be in the processor, which means it has to travel through various levels of the memory hierarchy. From slowest to fastest access times, the hierarchy is roughly: External Storage (optical disc, magnetic tape, etc.) -> HDD -> Main Memory (RAM) -> Processor Cache L3/L2 -> Processor Cache L1 -> Processor Registers. Video Memory (VRAM) sits off to the side of this chain: it is the video card’s own memory and is filled from main memory, rather than being a step between RAM and the processor’s cache. External storage access can be anywhere from one to several orders of magnitude (OoM) slower than hard disk access. Hard disk access is in turn several OoM slower than main memory, which in turn is slower than the processor’s cache, which is slower than the processor’s registers. An order of magnitude is a change by a power of 10 (powers of 2 are common in computing, but time is measured in the ordinary base 10, or decimal, system). For example, 10^1 = 10 and 10^2 = 100, so the difference between 10 and 100 is one order of magnitude, and the difference between 10 and 1,000,000 (10^6) is five orders of magnitude. This is why installing a game to the HDD on the Xbox 360 reduces loading times compared to pulling the data from the disc. Once data reaches main memory it tends to stay there until it is no longer needed, to avoid going back to the slower parts of the memory hierarchy, and it is copied to the processor or the video card depending on the type of data.
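To make the "order of magnitude" idea concrete, here is a small Python sketch. The helper name orders_of_magnitude is my own, and the two access times are assumed ballpark figures for illustration only, not measurements of any particular machine.

```python
import math


def orders_of_magnitude(smaller, larger):
    """Return how many powers of 10 separate two positive values."""
    return math.log10(larger / smaller)


# The two examples from the text above.
print(orders_of_magnitude(10, 100))
print(orders_of_magnitude(10, 1_000_000))

# Assumed ballpark access times in nanoseconds (illustrative, not measured).
MAIN_MEMORY_NS = 100        # roughly 100 nanoseconds
HARD_DISK_NS = 10_000_000   # roughly 10 milliseconds

print(orders_of_magnitude(MAIN_MEMORY_NS, HARD_DISK_NS))
```

With those assumed figures, the gap between main memory and the hard disk comes out to about five orders of magnitude, which is the kind of difference that makes avoiding the slower levels worthwhile.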
Now I’ll discuss the rendering (drawing to the screen) process, albeit very simplified. Once the data is in the video card’s memory, the video card performs the necessary processing to manipulate the data and draw an image on your screen. The data first goes through the vertex shader, where vertex information is processed. A vertex is a point where two or more lines meet, generally storing at a minimum a coordinate along the X, Y, and Z axes (a position in 3-dimensional space). A triangle has 3 vertices, one at each corner. A square can be made of two triangles placed together. Because they are placed together, the two triangles share two vertices, so storing those twice is redundant and we can eliminate the duplicates. Therefore, a square can be described with 4 vertices instead of 6. 3D models are simply combinations of many individual triangles, anywhere from a few hundred to millions. The data then goes through the rasterization process, which transforms the 3D model data into a 2D image made of pixels. A pixel is simply a point, usually holding a coordinate along the X and Y axes (a position in 2-dimensional space) and some color information. This is then fed through the pixel shader, where various operations are performed to generate visual effects. From here the final information goes to the frame buffer, where it is stored until it is time to draw the entire scene onto the screen.
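The shared-vertex idea for the square can be sketched in a few lines of Python. This is only an illustration of storing each corner once and referring to it by index; the names vertices, indices, and expand_triangles are my own, and a real graphics API would keep this data in its own buffer types.

```python
# Four unique corner positions of a unit square (x, y, z), with z = 0.
vertices = [
    (0.0, 0.0, 0.0),  # index 0: bottom-left
    (1.0, 0.0, 0.0),  # index 1: bottom-right
    (1.0, 1.0, 0.0),  # index 2: top-right
    (0.0, 1.0, 0.0),  # index 3: top-left
]

# Two triangles, each given as three indices into the vertex list.
# Corners 0 and 2 appear in both triangles but are stored only once.
indices = [0, 1, 2, 0, 2, 3]


def expand_triangles(vertices, indices):
    """Turn the index list back into a list of triangles (3 positions each)."""
    return [
        tuple(vertices[i] for i in indices[start:start + 3])
        for start in range(0, len(indices), 3)
    ]


triangles = expand_triangles(vertices, indices)
shared = set(triangles[0]) & set(triangles[1])

print(len(vertices), "stored vertices for", len(triangles), "triangles")
print("corners used by both triangles:", sorted(shared))
```

The index list still has six entries, but an index is much smaller than a full vertex, so the saving grows as vertices carry more data than just a position.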
Now that that is out of the way, I’ll explain the probable reason for the improved performance from merging the cuirass and greaves. When you go to render something you need to make a call to the graphics API (on the Xbox 360, and usually on Windows, this is Microsoft’s DirectX, while the PlayStation 3 uses Sony’s own APIs, including the OpenGL ES-based PSGL). Let’s pretend I have a model of a dog. When I tell the API to draw the dog on the screen, it first checks whether the data for the dog is already in video memory; if so, good, and the process described above proceeds. If not, I have to go through the memory hierarchy to get the needed data and send it to video memory, which could be fast or slow depending on whether the data is already in main memory. If there is enough free video memory, all is well and we proceed as normal. However, if video memory is full, or there is not enough free memory available, we have to unload data from video memory to make room for the new data. Naturally, this creates problems if we are trying to draw many different things in one frame that do not use the same data. Drawing two dogs is easy and only needs enough memory to store the data for one dog. Drawing a dog and a cat is more complex and requires both the dog model and the cat model to be in video memory when it is time to draw them. Keep in mind, however, that there are limits to what the video card can do at one time, so while we can easily draw 2 or 5 dogs on screen with only one model in video memory, it is much more difficult to draw 50 or 100 dogs on screen even though we still only have the one model in video memory. Now, what if I want to draw a dog, a cat, and a mouse on the screen (in that order), but I only have enough video memory to store the dog and the cat? First the dog model and cat model are loaded into video memory and drawn to the frame buffer.
Then some data is unloaded (whether it’s the dog or the cat model is unimportant at this point, but we’ll say the dog), and the mouse model is loaded into video memory and drawn to the frame buffer. Finally, now that everything that needs to be in this frame has been drawn to the frame buffer, the image in the frame buffer is sent to the screen. What if I want to draw two dogs, a cat, and a mouse? Well, it’s much the same; if I’m smart, I draw both dogs one after the other. If, however, I draw a dog followed by a cat and a mouse, and then the second dog, I have a problem. Now I have to load the dog data and the cat data into video memory and render them, then unload the dog data and load the mouse data and render it, and then unload the cat data and reload the dog data into video memory before rendering the second dog. This naturally takes more time than if I had drawn both dogs back to back.
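The dog, cat, and mouse example can be simulated with a short Python sketch. It assumes video memory holds exactly two models and that the model used least recently is the one unloaded; that eviction rule and the function name count_loads are my own choices for illustration, not a description of what any real driver or console does.

```python
from collections import OrderedDict


def count_loads(draw_order, capacity):
    """Count how many times a model must be loaded into video memory.

    Assumes every model takes one slot and the least recently used
    model is unloaded when there is no free slot.
    """
    resident = OrderedDict()  # models currently in video memory, oldest first
    loads = 0
    for model in draw_order:
        if model in resident:
            # Already in video memory: just mark it as recently used.
            resident.move_to_end(model)
            continue
        if len(resident) >= capacity:
            # No room left: unload the least recently used model.
            resident.popitem(last=False)
        resident[model] = True
        loads += 1
    return loads


dogs_together = ["dog", "dog", "cat", "mouse"]
dogs_apart = ["dog", "cat", "mouse", "dog"]

print("dogs drawn back to back:", count_loads(dogs_together, capacity=2))
print("dogs drawn apart:", count_loads(dogs_apart, capacity=2))
```

Under these assumptions, drawing the dogs back to back needs three loads, while splitting them up needs four, because the dog model has to be fetched a second time.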
Wait, what does this all have to do with the cuirass and greaves? Well, remember that the Xbox 360 doesn’t have separate video memory; instead, main memory is shared and a portion of it is dedicated as video memory. This means that the Xbox actually has less than its full 512 MB available as main memory, which in turn means more fetching of data from slower parts of the hierarchy. Most likely, when a scene is rendered, each character and all of their equipment is rendered at one time for logical reasons. Say I want to render a group of bandits in a cave next to a campfire. I first have to render the cave, and then I render their campfire. I next render the first bandit, a Nord in full leather armor with an axe. First I render his body, head, and hair, which are likely separate meshes. I then render his axe. Next I go to render his helmet, gloves, cuirass, greaves, and finally boots. Unfortunately, all this is too much data to fit in the memory I have allocated to the video card. So before I render the cuirass, I have to remove the cave and fire from memory, load in the data for the cuirass, and render it. Next I unload the body and head to fit the greaves and render them. Finally, I unload the hair and load the boots to render them, and so forth. I then go to draw the next bandit, repeating the steps as necessary. If I combine the cuirass and greaves, I have data that is roughly the size of the cuirass data plus the size of the greaves data, but there is a difference. While this piece of data is larger, it only requires fetching once, as opposed to once for each separate piece. By combining them we reduce the number of times we have to fetch data into video memory and speed up rendering of the whole scene. This in turn allows us to render more things in a scene in the same amount of time and thus have more complex scenes.
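Here is a rough Python sketch of that trade-off. The piece sizes are made-up units, the helper names merge_pieces and fetches_per_frame are my own, and it assumes the worst case where every piece has to be fetched again for every bandit; it only shows that merging lowers the fetch count without changing the total amount of data.

```python
# Hypothetical equipment pieces for one bandit, with made-up sizes
# in arbitrary units (not taken from any real game data).
separate_pieces = {
    "body": 4, "head": 2, "hair": 1, "axe": 1, "helmet": 1,
    "gloves": 1, "boots": 1, "cuirass": 3, "greaves": 2,
}


def merge_pieces(pieces, first, second, merged_name):
    """Return a copy where two pieces are replaced by one merged piece.

    The merged piece is as large as the two originals combined.
    """
    merged = {
        name: size for name, size in pieces.items()
        if name not in (first, second)
    }
    merged[merged_name] = pieces[first] + pieces[second]
    return merged


def fetches_per_frame(pieces, bandit_count):
    """Worst case assumed here: one fetch per piece for every bandit."""
    return len(pieces) * bandit_count


merged_pieces = merge_pieces(separate_pieces, "cuirass", "greaves", "armor")

print("total data, separate:", sum(separate_pieces.values()))
print("total data, merged:", sum(merged_pieces.values()))
print("fetches, separate:", fetches_per_frame(separate_pieces, bandit_count=5))
print("fetches, merged:", fetches_per_frame(merged_pieces, bandit_count=5))
```

With these assumed numbers, five bandits go from 45 fetches to 40 per frame while the total data stays the same; real savings would depend on how much actually stays resident between draws.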
Keep in mind that this is an incredibly simplified example, but it explains the issue of multiple render calls to the graphics card and why reducing them results in a performance gain from the merging. In my mind, that is the most likely cause.