I was just reviewing this code because I hit a pathological edge case where it was severely impacting the performance of generating mip levels, and I was surprised to see the approach taken.
Couple of alarming issues:
1. It uploads vertex data for every mip level to draw a quad, when this data only needs to be uploaded once. Lifting the upload out of the for loop is a trivial one-line change, but it will need further changes because of the _mesa_meta_setup_texture_coords stuff. It's probably best to create one large vertex buffer for all of this upfront instead.
2. It allocates storage for every mip level one iteration at a time, when it could allocate all the storage needed for the whole mip chain upfront in one shot.
3. Where the code limits minification to the source level, it appears to use GL_TEXTURE_MAX_LEVEL instead of GL_TEXTURE_MIN_LEVEL. I'm sure this is a copy-paste bug, since the same call is made above to set the max level and is then made again here.
4. The comment says it does not support 3D textures yet but it appears the code does handle 3D textures.
5. The approach itself seems like it could be done better via the glCopyTexImage* family of functions, which already read from the read buffer. That would avoid the ugly vbo/vao state changes (as well as the viewport changes) and the draw call issued for each mip level.
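
To illustrate point 2, here is a rough sketch of working out the whole mip chain before the loop, so the level count and every level's size are known upfront. This is not Mesa code: `struct mip_size`, `mip_level_count` and `mip_chain_sizes` are hypothetical names made up for this example, and it only covers the 2D case.

```c
#include <assert.h>

/* Hypothetical helper type for this sketch; not part of Mesa. */
struct mip_size {
    int width;
    int height;
};

/* Number of levels in a full mip chain: floor(log2(max(w, h))) + 1. */
static int mip_level_count(int width, int height)
{
    assert(width > 0 && height > 0);
    int largest = width > height ? width : height;
    int levels = 1;
    while (largest > 1) {
        largest >>= 1;
        levels++;
    }
    return levels;
}

/*
 * Fill out[] with the size of each level, starting at the base level.
 * Each dimension is halved (rounding down) per level and clamped at 1.
 * Returns the number of entries written, at most max_out.
 */
static int mip_chain_sizes(int width, int height,
                           struct mip_size *out, int max_out)
{
    int levels = mip_level_count(width, height);
    if (levels > max_out)
        levels = max_out;
    for (int i = 0; i < levels; i++) {
        out[i].width = width;
        out[i].height = height;
        width = width > 1 ? width / 2 : 1;
        height = height > 1 ? height / 2 : 1;
    }
    return levels;
}
```

With the level count known upfront, the storage could in principle be requested once before the loop (for instance via something like glTexStorage2D where ARB_texture_storage is available) instead of once per iteration. Whether that fits the meta path as it stands is an assumption on my part.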
Moving to i965 as it is the only driver to use this.