I have a plan for this! Now, I don't make video games, so I don't know all the technical ins and outs, but it seems to me that if you load the interior of only one building at a time, even with a normal loading screen to do it, you end up with a building whose windows, doors, and contents are "realtime" with the player. Every previous game has a load here because it dumps the city and loads the much smaller interior. This would instead load the single extra interior into the already loaded city. What I don't know is whether another load would be required to "dump" those assets as the player walks away from the loaded interior. I know that a long time ago video games always had to use right-angle hallways, because they could not have both adjoining rooms loaded at the same time. But if the interior can be dumped on the fly as you walk away from the building, the windows could just re-opaque at a certain distance.
And, of course, they can't simply have every interior load as you near it; running along a street of storefronts would load interior after interior, which would be a lot different than loading "one" interior.
BUT, if the player initiates entering a building, the game could load that one building into the existing surroundings. After the load, the player is standing outside the unopened door of a now-live interior; the door could be opened as any interior door normally opens, and windows could potentially be spied through, or even entered!
And of course, this also means you could see out from any house you are in!
Is there some technical reason this couldn't happen? For instance, is there an issue with "dumping" the assets as the player leaves the building? Remember that an interior would only load into the existing exterior when the player was near enough to enter it, so no more than one interior would ever need to be loaded at a time.
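
To make the idea concrete, here is a rough sketch in Python of the bookkeeping I'm imagining. It is not real engine code: the names (`Building`, `InteriorStreamer`, `request_enter`, `update`) and both distances are made up for illustration, and the actual loading and dumping of assets is reduced to flipping a flag. It only shows the rule that an interior loads when the player asks to enter, and gets dumped (windows back to opaque) once the player is far enough away, so at most one interior is ever live.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Building:
    """Hypothetical building record; windows are opaque until its interior is live."""
    name: str
    door_pos: Point
    windows_opaque: bool = True


class InteriorStreamer:
    """Keeps at most one interior loaded into the already loaded city."""

    def __init__(self, enter_range: float = 3.0, unload_distance: float = 30.0):
        # Both distances are arbitrary assumptions, not values from any real game.
        self.enter_range = enter_range
        self.unload_distance = unload_distance
        self.loaded: Optional[Building] = None

    def request_enter(self, building: Building, player_pos: Point) -> bool:
        """Player-initiated load: only works when standing near the door."""
        if math.dist(player_pos, building.door_pos) > self.enter_range:
            return False
        # Dump any other live interior first, so only one is ever loaded.
        if self.loaded is not None and self.loaded is not building:
            self._unload()
        self.loaded = building
        building.windows_opaque = False  # interior is now visible from outside
        return True

    def update(self, player_pos: Point) -> None:
        """Call every frame: dump the interior once the player has walked away."""
        if self.loaded is None:
            return
        if math.dist(player_pos, self.loaded.door_pos) > self.unload_distance:
            self._unload()

    def _unload(self) -> None:
        # Stand-in for freeing the interior's assets; windows re-opaque.
        self.loaded.windows_opaque = True
        self.loaded = None
```

Whether the dump in `_unload` can really happen on the fly, without a loading screen, is exactly the part I can't answer.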