Since blitting in pygame turned out not to be enough, it’s time to see how pygame’s experimental SDL2 module compares. When comparing FPS between the two approaches, keep in mind that FPS is non-linear: the same FPS gain saves far less frame time at a high frame rate than at a low one. Also worth noting: as of this writing, the SDL2 module in pygame is still experimental and in development.
Pygame’s blitting system is quite easy to use, but it is CPU-based: surface1.blit(surface2). SDL2 rendering is done on the GPU and can therefore be much faster. It looks like this:
texture = pygame._sdl2.video.Texture.from_surface(renderer, surface)
texture.draw(dstrect=(200, 150, 400, 300))
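For context, here is a minimal sketch of how those two calls might sit inside a full render loop. The window size, sprite size, NPC count and the random_dstrects helper are assumptions made up for illustration, not the prototype's actual code, and since pygame._sdl2 is experimental its API may change between versions.

```python
import random


def random_dstrects(count, screen_size=(800, 600), sprite_size=(16, 16), seed=0):
    """Return `count` (x, y, w, h) rectangles that fit fully inside the screen."""
    rng = random.Random(seed)
    screen_w, screen_h = screen_size
    w, h = sprite_size
    return [
        (rng.randrange(screen_w - w + 1), rng.randrange(screen_h - h + 1), w, h)
        for _ in range(count)
    ]


def main(npc_count=6500):
    # Imported here so the helper above stays usable without pygame installed.
    import pygame
    from pygame._sdl2.video import Window, Renderer, Texture

    pygame.init()
    window = Window("SDL2 sketch", size=(800, 600))
    renderer = Renderer(window)

    # Build the sprite on the CPU once, then upload it to the GPU as a texture.
    surface = pygame.Surface((16, 16))
    surface.fill((200, 60, 60))
    texture = Texture.from_surface(renderer, surface)

    clock = pygame.time.Clock()
    frame = 0
    running = True
    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False

        renderer.draw_color = (0, 0, 0, 255)
        renderer.clear()
        # New positions every frame, i.e. NPCs "teleport" rather than move.
        for rect in random_dstrects(npc_count, seed=frame):
            texture.draw(dstrect=rect)  # one draw call per NPC
        renderer.present()

        clock.tick()  # no FPS cap; only tracks timing for get_fps()
        frame += 1

    print(f"last measured FPS: {clock.get_fps():.0f}")
    pygame.quit()
```

Calling main() opens the window and runs until it is closed. The texture is uploaded once and reused for every NPC, so the per-frame cost is mostly the draw calls themselves.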
It took roughly a day to convert my prototype to use SDL2; it’s fairly easy to use. I managed to get 130 FPS with 6500 NPCs.
This is a remarkable improvement over blitting, which handled roughly half as many NPCs (3000+) at a little over half the frame rate (75+ FPS). cProfile’s top offender seems to be drawing the texture, indicating that no Python logic aside from rendering is slowing us down.

Note that draw_on_texture() is basically SDL2’s texture.draw().
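If you want to reproduce this kind of measurement, a rough sketch with the standard library's cProfile and pstats could look like the following. The draw_on_texture and update_logic functions here are hypothetical CPU-burning stand-ins, not the real rendering or game code; only the profiling pattern is the point.

```python
import cProfile
import pstats


def draw_on_texture(n=100_000):
    # Hypothetical stand-in for the real draw call: just burns CPU time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def update_logic(n=1_000):
    # Hypothetical stand-in for cheap per-frame game logic.
    return sum(range(n))


def frame():
    update_logic()
    draw_on_texture()


def profile_frames(frames=20):
    """Profile `frames` iterations of the loop body and return the stats."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(frames):
        frame()
    profiler.disable()
    return pstats.Stats(profiler)


def top_offender(stats):
    """Name of the function with the largest own time (tottime)."""
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    func, _ = max(stats.stats.items(), key=lambda item: item[1][2])
    return func[2]


def print_report(stats, limit=5):
    # Sort by own time so the most expensive function is listed first.
    stats.sort_stats("tottime").print_stats(limit)
```

Running print_report(profile_frames()) prints the usual cProfile table. When the draw function sits at the top by tottime, the remaining Python logic is not the bottleneck.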
On the surface, this seems workable. However, once I started adding movement logic (instead of “teleporting” NPCs from one position to another), a bit of animation, path-finding and collisions, the FPS dipped way below 60. Yes, you can optimize in Python. Yes, you can Cythonize your .pyx files to get that nice C performance, but it’s too early! There will be ample reason to optimize at this level down the road, once I start adding heavy game logic for simulating the sandbox environment.
As an experiment, I commented out the draw() line so that nothing is rendered, keeping the existing logic from my “6000 NPCs for 130 FPS” test. The improvement was predictable but still staggering:
Yes, it’s not linear, but going from 130 FPS to 910 FPS is a 6.6 ms frame-to-frame improvement (7.7 ms down to 1.1 ms per frame), and all of that time was going to rendering. For reference, going from 30 to 120 FPS is a 25 ms gain (33.3 ms down to 8.3 ms per frame).
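The frame-time arithmetic is easy to check by hand or with a few lines of Python. This is a small sketch; the function names are made up for illustration. Frame time in milliseconds is 1000 divided by FPS, and the gain is the difference between two frame times.

```python
def frame_time_ms(fps):
    """Time budget of a single frame, in milliseconds."""
    return 1000.0 / fps


def frame_time_saved_ms(fps_before, fps_after):
    """Milliseconds saved per frame when going from fps_before to fps_after."""
    return frame_time_ms(fps_before) - frame_time_ms(fps_after)


if __name__ == "__main__":
    for before, after in [(130, 910), (30, 60), (30, 120)]:
        saved = frame_time_saved_ms(before, after)
        print(f"{before} -> {after} FPS saves {saved:.1f} ms per frame")
```

This is why raw FPS differences mislead: a jump of 780 FPS (130 to 910) saves about 6.6 ms per frame, less than the jump of just 30 FPS from 30 to 60.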
Rendering using SDL2 is a big improvement, but I still feel that it doesn’t leave me much room to maneuver down the line. It’s faster because it renders on the GPU, but it still draws each texture separately, even when the same texture is drawn multiple times. Of course, I’m alluding to instanced rendering, which unfortunately is not supported in this SDL2 module. This brings me to explore OpenGL next. It should allow for ultimate flexibility using shaders. More on that later.