I've been reading the Allegro documentation and found that Allegro 4.4.2 has hardware acceleration support right out of the box, but the drawing functions need to be rewritten to take advantage of it. Just forcing the engine to run in an accelerated mode (GFX_DIRECTX, for example) does nothing performance-wise.
As far as I understand, a virtual screen larger than the visible camera area is needed for fast VRAM operations such as blending, hardware scrolling, hardware stretching, polygon drawing and so on. The maximum size of that virtual screen depends on the available VRAM.
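To get a rough feel for the VRAM limit, here is a small sketch in plain C. The helper names (surface_bytes, virtual_screen_fits) are made up for illustration and are not part of Allegro, and the estimate assumes tightly packed pixels with no pitch padding or alignment, so a real driver may need more memory than this.

```c
#include <stddef.h>

/* Hypothetical helper (not an Allegro function): bytes needed for a
 * w x h surface at the given color depth, assuming tightly packed
 * pixels. 15-bit modes are rounded up to 2 bytes per pixel. */
size_t surface_bytes(int w, int h, int bpp)
{
    if (w <= 0 || h <= 0 || bpp <= 0)
        return 0;
    return (size_t)w * (size_t)h * (size_t)((bpp + 7) / 8);
}

/* Hypothetical helper: returns 1 if a virtual screen of the given size
 * fits into vram_bytes, 0 otherwise. This is only a lower-bound check;
 * it ignores whatever the driver reserves for itself. */
int virtual_screen_fits(int virt_w, int virt_h, int bpp, size_t vram_bytes)
{
    size_t need = surface_bytes(virt_w, virt_h, bpp);
    return need != 0 && need <= vram_bytes;
}
```

For example, a 1280x960 virtual screen needs about 2.3 MB at 16 bpp but about 4.7 MB at 32 bpp, so on a card with 4 MB of VRAM only the 16 bpp version would fit. In Allegro the virtual size is requested through the v_w and v_h arguments of set_gfx_mode(), which fails if the driver cannot provide it.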
Aside from the usual request to implement hardware support for the sake of modding, using it for Mode 7 could make it run fast and smooth. The current stretching algorithm makes the engine choke when you try to scale several moderately large sprites at the same time. In Mode 7 every line of the ground has a different scale, getting smaller towards the horizon, and all sprites are scaled in the same fashion. We have yet to see how it performs, but we might want to try porting to the native 4.4.2 hardware support, since it doesn't require rewriting all the code, just the drawing functions.
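To illustrate the per-line scaling, here is a minimal sketch in plain C. It assumes a simple flat-ground perspective model where the scale grows linearly with the distance of a screen row from the horizon. The function names and the cam_height parameter (camera height, in the same units as screen rows) are my own assumptions, not something taken from the engine or from Allegro.

```c
/* Hypothetical helper: scale factor for the ground line drawn on
 * screen row y. Rows at or above the horizon get scale 0. Under the
 * flat-ground model, a point at depth z lands on row
 * horizon_y + focal * cam_height / z, so its scale focal / z equals
 * (y - horizon_y) / cam_height. */
double mode7_line_scale(int y, int horizon_y, double cam_height)
{
    if (y <= horizon_y || cam_height <= 0.0)
        return 0.0;
    return (double)(y - horizon_y) / cam_height;
}

/* Hypothetical helper: on-screen size (rounded to nearest pixel) of a
 * sprite dimension src_size for a sprite standing on row y. */
int mode7_scaled_size(int src_size, int y, int horizon_y, double cam_height)
{
    double s = mode7_line_scale(y, horizon_y, cam_height);
    return (int)(src_size * s + 0.5);
}
```

With the horizon at row 100 and cam_height of 100, a 64-pixel sprite standing on row 150 would be drawn 32 pixels wide, and at full size on row 200. The resulting width and height are what you would hand to a stretch call such as Allegro's stretch_sprite(); whether that call is actually accelerated depends on the driver and on the bitmaps living in video memory.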