Thursday, January 21, 2010

File header from my first programming assignment in 4 years

What can I say... I'm a bit of a nerd, and I guess I missed doing this stuff.
/********************
* render_point.cpp *
********************

(1) Christopher Casey
(2) CS180, Spring 2010
(3) Assignment #1
(4) Due Wednesday, January 27

SUMMARY: An x86 FPU Assembly implementation of Mandelbrot Set
calculations.

OBJECTIVE: Use inline assembler to provide a speed increase
over the following reference code:

//jsh 1/08
double a = 0.0, b = 0.0, norm2 = 0.0;
for (n = 0; norm2 < 4.0 && n < N; ++n) {
double c = a*a - b*b + x;
b = 2.0*a*b + y;
a = c;
norm2 = a*a + b*b;
}

SOLUTION NOTES:
I initially used FPU assembler to convert the body of the
loop. This saved on a few calls to the `fld` instruction as
items were copied from system memory to the FPU stack. On my
second iteration of optimization, I wrote the loop condition
in asm as well. In addition to saving a few more `fld` calls,
this allowed me to re-use the `a*a` and `b*b` values
calculated for norm2.

I've also added code that tests to see if the point is in the
cardioid or the period-2 bulb. I found a write-up about these
optimizations and decided to try implementing them. They are
available via pre-processor directives so they can be turned
off, as they only increase performance when in the default
view. These optimizations were written in C for legibility
(and because they aren't the inner-most loop). The speed
increase realized in the default view is enormous -- the
scene is rendered in about 1/5 the time.

Optimization can still be made to the loop: like the reference
code, it iterates once more than necessary when reaching max
iterations. However, since we default to 5120 iterations this
would provide a hardly noticable increase. One could also
optimize the return value by providing a look-up table rather
than performing integer-to-double-to-integer conversions to
convert `n` to a COLORREF type. Doing this would require
explicit knowledge of the value of `N`.

KNOWN ISSUES:
My solution returns a slightly different set of values for the
points than the reference code. The discrepancy is due to two
different issues. First, I believe that a minor bug exists in
the reference code. The code tests for `norm2 < 4.0`, but
typical Mandelbrot Set implementations test for `norm2 <= 4.0`.
My code corrects this.

The second, and more broad issue is that because my floating
point operations occur in a different order than the ones
generated by the reference code (and I add a+a instead of
multiplying a*2), the floating point error produces slightly
different results.

The error is slight, and I don't feel either issue requires
further investigation at this time.

BENCHMARK:
Tested on an Intel i7 920 Windows 7 64-bit, mandelbrot.cpp as driver, max 5120 iterations

Default View (lower_left_x = -1.5f; lower_left_y = -1.0f; scale = 2.0f/length;):
Reference code [/Od]: 9.38 seconds
Reference code [/O2]: 6.77 seconds
(40% speed increase over /Od)
Reference code [/Ox]: 6.53 seconds
(40% speed increase over /Od)
__asm [/Od]: 4.60 seconds
(100% speed increase over reference + /Od,
or 45% increase over /O2)
__asm (+optimizations) [/Od]: 0.80 seconds
(1000% speed increase over reference + /Od)

Alternate View (lower_left_x = -0.25; lower_left_y = 0.775; scale = 0.25/length;):
Reference code [/Od]: 7.03 seconds
Reference code [/O2]: 5.71 seconds
(20% speed increase over /Od)
Reference code [/Ox]: 5.70 seconds
(20% speed increase over /Od)
__asm [/Od]: 2.40 seconds
(190% speed increase over reference + /Od,
or 130% increase over /O2)
__asm (+optimizations) [/Od]: 2.40 seconds
(190% speed increase over reference + /Od)

REFERENCES:
Assignment details:
https://faculty.digipen.edu/~jhanson/protected/cs180assignment1.pdf
https://faculty.digipen.edu/~jhanson/protected/render_point.h
https://faculty.digipen.edu/~jhanson/protected/mandelbrot.cpp
https://faculty.digipen.edu/~jhanson/protected/render_point0.cpp

Optimization notes:
http://www.nitorijournal.org/?p=517
(Cardioid and period-2 Bulb detection)

Additional notes: (researched, but not used in this project)
http://users.softlab.ece.ntua.gr/~ttsiod/mandelSSE.html
(CUDA and SSE-based mandelbrots)
http://www.ozone3d.net/tutorials/mandelbrot_set_p4.php
(General Purpose GPU Stream processing mandelbrots)
*/

Tuesday, January 12, 2010

A letter to my professor

Professor ["C"],

In today's lecture we discussed frame rate, and they eye's ability to detect motion. I was a little concerned that some misconceptions were being fueled about the human eye's ability to detect motion. To say that the eye is "only capable" of seeing 12, 24, 30, or 60 "frames" is not correct. When I brought this up in class, you mentioned the example of a spinning wheel appearing to move backwards. I feel this example is flawed. Generally, this illusion applies to the wheel being captured on film at 24 hz, where we obviously have a stroboscopic effect causing the wheel to appear to turn backwards. It can also be observed in indoor lighting, particularly with flourescents, as the lighting itself operates at 60 hz.

There have actually been some real-world, sunlit experiments where people would perceive a wheel as rotating backwards, but I believe they are largely considered to be the result of eye fatigue, and not of some explicit limit in the brain's ability to interpret the data from the retina. I wonder if this may also be in part a result of our brains becoming so accustomed to strobic lights in modern life.

I found an interesting article on the subject:
"Illusory motion reversal is caused by rivalry, not by
perceptual snapshots of the visual field."
Keith Kline, Alex O. Holcombe, David M. Eagleman

http://wexler.free.fr/library/files/kline%20(2004)%20illusory%20motion%20reversal%20is%20caused%20by%20rivalry,%20not%20by%20perceptual%20snapshots%20of%20the%20visual%20field.pdf

Much of the article is technical neuroscience that I don't understand, but their conclusion is clear: motion reversal illusion cannot be explained by snapshots of the visual field.

Also, you posed the question in class today, "what does 'real-time' really mean?" I think it's a term that's become a little bit bastardized to mean "fast, snappy response time," but the technical definition is that a real-time system has a fixed amount of time allotted to complete a task.

Someone brought up the idea that a turn-based game like chess is not "real-time." Actually, in a tournament chess match with a hard timer, chess is a real-time game. Because the operation (selecting and executing a move) must be completed by a discrete moment, a computer opponent can be considered a real-time system. If no deadline is given, the operation can run as a simple batch process with no concern to the time it takes to complete.

Rigorously speaking, most game engines don't run in true real-time any more; it's just very important that we get really close, and handle things properly when we don't make our 16ms deadline :)