Feature #1468

Profile and optimize CC blending algorithm

Added by GeekToo almost 9 years ago. Updated almost 9 years ago.

Status: New
Start date: 2010-09-12
Priority: Normal
Due date:
Assignee: GeekToo
% Done: 50%
Category: -
Target version: -

Description

I did some profiling on the CC blending algorithm and made some optimizations to it.
In-game profiling gave unpredictable results, so I created a small test app that measures both the current and the optimized algorithm.
That gave predictable results: the optimized version is twice as fast as the old one.
That is on my machine of course, which is a single-core, 32-bit Intel.
The test should probably be repeated on other setups:

gcc -pg test_prof.cpp
./a.out
gprof a.out > profile_result

Flat profile:

Each sample counts as 0.01 seconds.
    %  cumulative     self                self    total
 time     seconds  seconds      calls  ns/call  ns/call  name
48.13       15.88    15.88  116644000   136.14   187.92  ComposeColourBlend2(unsigned int, int)
13.38       20.29     4.41  116644000    37.84    89.63  ComposeColourBlend(unsigned int, int)
13.05       24.60     4.30  696960000     6.18     6.18  unsigned int GB<unsigned int>(unsigned int, unsigned char, unsigned char)
 9.25       27.65     3.05  696960000     4.38     4.38  unsigned int GB<int>(int, unsigned char, unsigned char)
 6.93       29.94     2.29  929280000     2.46     2.46  int min<int>(int, int)
 4.05       31.27     1.34  464640000     2.88     2.88  int max<int>(int, int)
 3.34       32.37     1.10  232320000     4.74     4.74  ComposeColour(unsigned int, unsigned int, unsigned int, unsigned int)
 1.74       32.95     0.58                               main
 0.21       33.02     0.07                               gmon_start

test_prof.cpp - updated test program (7.51 KB) GeekToo, 2010-09-12 20:21

Associated revisions

Revision 63:632bbfcc8d25
Added by GeekToo almost 9 years ago

Codechange: optimize CC blend in blitter #1468

History

#1 Updated by GeekToo almost 9 years ago

Better formatted test-sequence:

gcc -pg test_prof.cpp
./a.out
gprof a.out > profile_result

#2 Updated by GeekToo almost 9 years ago

  • File deleted (test_prof.cpp)

#3 Updated by GeekToo almost 9 years ago

Since an optimized build gives a more realistic result, and gprof in combination with an optimized build gave some problems, here's a different test sequence:

gcc -O2 -Wall test_prof.cpp
time ./a.out

Then comment out the other function in main, recompile, and retry.

#4 Updated by Ammler almost 9 years ago

Feel free to customize the build script so you can also run such tests on our server, on push or for testing...

#5 Updated by GeekToo almost 9 years ago

Now the results are even better (and they reproduce very well too), roughly a 5.7x speedup in wall-clock time.
Algorithm as in repo:
real 0m10.055s
user 0m9.829s
sys 0m0.000s

Improved algorithm:
real 0m1.764s
user 0m1.720s
sys 0m0.004s

#6 Updated by GeekToo almost 9 years ago

  • % Done changed from 0 to 50

Improvements implemented in revs 63, 65 and 66.
