Your idea seems good; it's a lot better if you want lots of collision checks or fluid dynamics.
I had lots of ideas for cube-voxel rendering, so I thought I could share them.
Dunno if this will be helpful, but maybe someday someone will use it for something.
(I was/am making a multiplayer voxel shooter, rendered with cubes and partially with GL_POINTS.)
I experimented with a 2D array of octrees (and also with an octree of 2D arrays) and both were decent, but I ended up using one big octree.
The octree had a LoD level, so anything far away was rendered out of bigger cubes (or GL_POINTS).
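
Roughly what I mean by picking a LoD level from distance, as a small Python sketch. The function name, the default levels and the "drop one level per doubling of distance" rule are just assumptions for illustration, not what my engine did exactly:

```python
import math

def lod_level_for_distance(distance, max_level=10, min_level=4, base_distance=32.0):
    """Return the octree depth to render at for something `distance` units away.

    Assumption: full detail up to `base_distance`, then one level coarser
    every time the distance doubles, never going below `min_level`.
    """
    if distance <= base_distance:
        return max_level
    drop = int(math.log2(distance / base_distance))
    return max(min_level, max_level - drop)
```

Each level you drop makes the cubes twice as big per axis, so far-away stuff gets much cheaper very quickly.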
Another good optimization is drawing the meshes in near-to-far (not random) order, so the graphics card can use the early depth test (early-z) to skip pixels that are already hidden.
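
The near-to-far ordering can be as simple as sorting chunk centers by squared distance to the camera before drawing. A minimal sketch (the function name and the tuple-based positions are made up for the example):

```python
def sort_front_to_back(chunk_centers, camera_pos):
    """Return chunk centers ordered from nearest to farthest from the camera."""
    def dist_sq(center):
        # Squared distance is enough for ordering, so no sqrt is needed.
        return sum((a - b) ** 2 for a, b in zip(center, camera_pos))
    return sorted(chunk_centers, key=dist_sq)

order = sort_front_to_back([(10, 0, 0), (1, 0, 0), (5, 0, 0)], (0, 0, 0))
# order is [(1, 0, 0), (5, 0, 0), (10, 0, 0)]
```

It only has to be roughly sorted, so doing it per chunk (not per cube) is fine.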
I rendered with display lists, and the map was drawn out of 'bigger' chunks.
So basically, let's say the tree has 10 levels: I set renderLevel to 5 and lodLevel to 10,
and it would build display lists with a lot of cubes in each.
If I set lodLevel to 6, each display list would hold at most 8 big cubes.
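
The numbers work out like this, assuming one display list per octree node at renderLevel and cubes taken from lodLevel (that is my reading of my own setup; the function names are only for the example):

```python
def max_cubes_per_chunk(render_level, lod_level):
    """Upper bound on cubes in one chunk: each level below the chunk splits by 8."""
    if lod_level < render_level:
        raise ValueError("lod_level must be >= render_level")
    return 8 ** (lod_level - render_level)

def max_chunks(render_level):
    """Upper bound on the number of chunks (display lists) for the whole tree."""
    return 8 ** render_level

# renderLevel=5, lodLevel=6  -> at most 8 cubes per chunk
# renderLevel=5, lodLevel=10 -> at most 8**5 = 32768 cubes per chunk
```

These are worst cases for a completely full tree; empty space costs nothing.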
The files were laid out the same way: they contained several LoDs (e.g. levels 10, 8, 6),
so I could simply load low-resolution voxels.
I had the map saved at e.g. level 10, but I limited splitting to level 12, so an explosion could make really nice-looking (almost Worms-style) holes.
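
Here is a rough sketch of how an explosion can carve a hole with a split limit. The node layout (True = solid, False = empty, list of 8 = split node) and the function name are assumptions for the example, not my actual code:

```python
def carve_sphere(node, origin, size, center, radius, depth, max_depth):
    """Remove a sphere from an octree node, splitting no deeper than max_depth."""
    if node is False:
        return False  # already empty
    near_sq = far_sq = 0.0
    for o, c in zip(origin, center):
        lo, hi = o, o + size
        nearest = min(max(c, lo), hi)          # closest point of the cube on this axis
        near_sq += (c - nearest) ** 2
        far_sq += max(abs(c - lo), abs(c - hi)) ** 2
    r_sq = radius * radius
    if near_sq >= r_sq:
        return node   # cube is completely outside the sphere
    if far_sq <= r_sq:
        return False  # cube is completely inside the sphere
    half = size / 2
    if depth == max_depth:
        # Split limit reached: decide by the cube's center point.
        mid_sq = sum((o + half - c) ** 2 for o, c in zip(origin, center))
        return False if mid_sq < r_sq else node
    children = node if isinstance(node, list) else [True] * 8
    carved = []
    for i, child in enumerate(children):
        child_origin = (origin[0] + half * (i & 1),
                        origin[1] + half * ((i >> 1) & 1),
                        origin[2] + half * ((i >> 2) & 1))
        carved.append(carve_sphere(child, child_origin, half,
                                   center, radius, depth + 1, max_depth))
    # Merge back into a leaf if all children ended up identical.
    if all(c is True for c in carved):
        return True
    if all(c is False for c in carved):
        return False
    return carved
```

With the map stored at level 10 and max_depth at 12, the edge of the hole gets cubes 4 times smaller per axis than the map's own voxels, which is where the smooth look comes from.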
I projected the octree level I was currently working on onto a 3D grid, and then I could optimize it (join the faces etc.).
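
Once the level is in a grid, the simplest optimization is to emit only the faces that border empty space. This sketch shows just that hidden-face step (it does not join faces into bigger quads), and the set-of-cells representation is an assumption for the example:

```python
# The six axis-aligned face directions of a cube.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def visible_faces(solid):
    """Given a set of solid (x, y, z) cells, return (cell, direction) pairs
    for every face whose neighbour in that direction is empty."""
    faces = []
    for (x, y, z) in solid:
        for (dx, dy, dz) in DIRECTIONS:
            if (x + dx, y + dy, z + dz) not in solid:
                faces.append(((x, y, z), (dx, dy, dz)))
    return faces

# Two touching cubes share one hidden face each: 12 - 2 = 10 faces remain.
two_cubes = visible_faces({(0, 0, 0), (1, 0, 0)})
```

Joining neighbouring faces that lie in the same plane into bigger rectangles would be the next step on top of this.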
I read a lot about optimizations in Minecraft, and there is one that's easy to use and suitable for your project:
When you are creating your mesh, split it by the direction the faces are looking.
So you will have 6 meshes instead of 1 (one per cube face direction), and you can render only those that can be visible to the player.
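
A sketch of how to decide which of the per-direction meshes to draw, assuming each chunk has an axis-aligned bounding box (the function name and the "+X"/"-X" labels are made up for the example):

```python
def visible_direction_meshes(camera_pos, chunk_min, chunk_max):
    """Return the face directions of a chunk that could face the camera.

    Faces looking towards +X can only be seen from a camera whose x is greater
    than the chunk's minimum x, and faces looking towards -X only from a camera
    whose x is less than the chunk's maximum x. Same for Y and Z.
    """
    visible = []
    for axis, name in enumerate("XYZ"):
        if camera_pos[axis] > chunk_min[axis]:
            visible.append("+" + name)
        if camera_pos[axis] < chunk_max[axis]:
            visible.append("-" + name)
    return visible

meshes = visible_direction_meshes((100, 100, 100), (0, 0, 0), (16, 16, 16))
# meshes is ["+X", "+Y", "+Z"]
```

For a chunk the camera is outside of on all three axes, that's only 3 of the 6 meshes; a camera inside the chunk's bounds on some axis needs both directions for that axis.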