While I agree that it would be cool, there are several reasons why (at least in Oblivion's engine) it wouldn't work. The main one is that with the engine there is no feasible way to make it work unless all the caves were specifically on the cliffs/ridges of hills and mountains, which there were not many of. Even then they would have to have a hole in the ground, which (to my knowledge from using the Construction Set) could not be done, as the ground is, from a technical standpoint, a single mesh. Along with that, the fact that enemies could easily detect you through thin walls would make sneaking in cities very challenging. Finally, think about all the models, from house interiors to random loot to NPCs. Now take all of that from EVERY building in Bruma and try to load it all at once. It won't work too well.
This is all (basically as you said) based on Oblivion's engine. And you can't compare a 5-year-old engine with what's possible today.
Firstly, I expect caves not to be in the middle of a forest as in Oblivion. They should be in more logical places, near mountains or hills. Or if they are in a forest, they should at least be made believable instead of being simply a big, lonely rock with a door in it.
Secondly, I have no idea how to do a hole in the ground, lol. But my guess is that after 5 years, technology has developed so far that they know how to dig a hole.
Thirdly, AI can know when there is something in the way between you and an NPC. This was done in Oblivion: archers didn't shoot if you were behind a building. The same idea can be applied here.
Fourthly, I don't know about this. Gamebryo handled lots of items horribly. I've played other games with "real open" cities that have "real open" houses. No problems there. So there shouldn't be any reason why Bethesda can't do this too.