Judging from your posted results, it should be possible to improve the performance further.
You have only 1800 grid squares, but the densest squares hold over 37000 rows each. The goal is for the number of squares to be slightly larger than the number of rows in the densest square.
Try setting @d=20000 in my original script to get a more even distribution.
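
If you want a quick way to check that goal after each change, here is a minimal sketch in Python. It assumes you have already exported the row count for each grid square; the function name grid_balance and the sample counts are made up for illustration and are not part of my original script.

```python
def grid_balance(rows_per_square):
    """Summarize how well a grid meets the goal described above.

    rows_per_square: iterable of row counts, one entry per grid square
    (hypothetical input, e.g. exported from a GROUP BY on the square id).
    """
    counts = list(rows_per_square)
    if not counts:
        raise ValueError("need at least one grid square")
    num_squares = len(counts)
    max_rows = max(counts)
    return {
        "num_squares": num_squares,
        "max_rows": max_rows,
        # Goal from the text: more squares than rows in the densest square.
        "meets_goal": num_squares > max_rows,
    }


# Made-up counts shaped like the posted results:
# 1800 squares, with the densest one holding 37000 rows.
posted = [37000] + [100] * 1799
print(grid_balance(posted)["meets_goal"])  # False
```

The check only compares the two numbers; it says nothing about how far to move @d, so treat it as a sanity test to rerun after each adjustment.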