Mali OpenCL SDK v1.1.0
 All Classes Files Functions Variables Macros Pages
Mandelbrot

The results of calculating the Mandelbrot set produces fractal patterns.

Example Result

mandelbrot_output.bmp
Output image

The Algorithm

A complex number is in the Mandelbrot set if the it doesn't tend to infinity when a given equation is iterated on the number.

The equation used is:

zn+1 = zn2 + c

where c is the complex number and n is the iteration. z0 is set as 0.

To produce a visualization of the set, like the one shown above, you can treat the x and y pixel coordinates of an image as the real and imaginary components of a complex number. For example, at (2, 1) the complex number c = 2 + 1i.

It can be shown that if the distance from zn to the origin is greater than 2 for any value of n, it is unbounded and therefore not part of the Mandelbrot set.

Examples:

  • Not in the Mandelbrot set:
    c = 2 + 1i

    z0 = 0
    z1 = 2 + 1i

    If you use 2 + 1i as coordinates (2, 1) then the euquidean distance to the origin is sqrt(22 + 12) = 2.24.
    2.24 > 2 and therefore for this input the equation is unbounded and the input is not part of the Mandelbrot set.

  • In the Mandelbrot set:
    c = 0 + 1i

    z0 = 0
    z1 = 0 + 1i
    z2 = -1 + 1i
    z3 = 0 - 1i
    z4 = -1 + 1i

    You can see that this is forming a repeating pattern that will never become unbounded.

To create the output image, the number of iterations taken to determine whether the point is or is not part of the Mandelbrot set is used as the pixel intensity. A limit is placed on the number of iterations such that cases like c = 0 + 1i do not run forever. Therefore, a pixel which is in the Mandelbrot set will have a value equal to the maximum number of iterations.

For more details see Wikipedia.

Implementation

Unless otherwise noted, all code snippets come from the OpenCL kernel found in mandelbrot.cl.

  1. Choosing the size of the kernel

    We are using vector types in the kernel and so we are actually outputting 4 results per kernel. See below for more details of vectorising. We split the data for the real and imaginary parts in order to be able to represent it.

    We adjust the pointers into the data to reflect this:

    /*
    * Each kernel calculates 4 output pixels in the same row (hence the '* 4').
    * x is in the range [0, width] in steps of 4.
    * y is in the range [0, height].
    */
    int x = get_global_id(0) * 4;
    int y = get_global_id(1);

    And when we enqueue the kernel in mandelbrot.cpp, we reduce the worksize accordingly:

    /*
    * The OpenCL kernel calculates four pixels at a time (all in the same row).
    * Therefore, we only need to run the kernel width / 4 times in the x dimension.
    */
    size_t globalWorksize[2] = {width / 4, height};
  2. Creating the inputs

    Mali-T600 series GPU pipelines provide true IEEE-754 single-precision floating-point math in hardware. Each Mali-T600 series GPU core has a minimum of two 128-bit wide ALUs (Arithmetic logic units). Each ALU can do a maximum of two vector floating point operations per cycle (one vector floating point addition and one vector floating point multiplication).

    In this sample, the calculations use 32-bit floating point numbers. One 128-bit vector can fit four 32-bit floating point numbers. Therefore, using float4's makes maximum use of the hardware.

    We recommend the use of vectors wherever possible when using a Mali-T600 series GPU.

    Here we create the input vectors:

    /*
    * Calculate the coordinates of the four pixels in the real and imaginary space.
    * Scale the coordinates by the height and width such that the same image is produced for all values, just at different resolutions.
    * The real coordinate is scaled to the range [0, 2.5] (using '/ width * 2.5f') and then shifted to be in the range [-2, 0.5] (using '-2').
    * The imaginary coordinate is scaled to the range [0, 2] (using '/ ((float) height) * 2' and then shifted to be in the range [-1, 1] (using '-1').
    * The resulting ranges ([-2, 0.5], [-1, 1]) are the limits of the interesting parts of the Mandelbrot set.
    * Four pixels (adjacent in x (real) dimension) are calulated per kernel instance. Therefore, we calculate real coordinates for 4 pixels.
    */
    float4 initialReal = -2 + (createStartX(x) / (float)width * 2.5f);
    float4 initialImaginary = -1 + (y / (float)height * 2);
    float4 real = initialReal;
    float4 imaginary = initialImaginary;
  3. Doing the calculation

    All of the main calculations are done on vectors of 4 floating point numbers. Each vector calculation can be done as a single operation on Mali-T600 series GPU.

    The main calculation loop:

    /*
    * Loop until we can say that all pixels are not part
    * of the Mandelbrot set or the maximum number of
    * iterations has been reached.
    */
    do
    {
    iterations++;
    if (iterations > MAX_ITER)
    {
    break;
    }
    /* Backup the real value as its used in both calculations but changed by the first. */
    float4 oldReal = real;
    /* Evaluate the equation. */
    real = real * real - imaginary * imaginary + initialReal;
    imaginary = 2 * oldReal * imaginary + initialImaginary;
    /*
    * Calculate which indices to update.
    * Mathematically, if the result of the calculation becomes greater that 2, it will continue to infinity.
    * Therefore, if the result becomes greater than 2 it cannot be part of the Mandelbrot set and we stop adding to it's iteration count.
    * To get the absolute value of the calculation we have squared and added the components, therefore we must test it against
    * 4 (2 squared).
    */
    float4 absoluteValue = real * real + imaginary * imaginary;
    /* For vector input isless(x, y) returns -1 per component if x < y. */
    mask = isless(absoluteValue, (float4) 4.0f);
    /* Increase the iterations per pixel (if they are less than 4). */
    iterationsPerPixel -= mask;
    } while(any(mask));
  4. Storing the results

    Finally we convert the data to unsigned chars and store the data.

    Pixels in the Mandelbrot set will have the value of MAX_ITER.

    We use a vector store to write out all 4 results at once:

    /*
    * Write the result to the output buffer.
    * Convert the output to unsigned char to make it easier to write to a bitmap.
    */
    vstore4(convert_uchar4(iterationsPerPixel), 0, output + x + y * width);

Running the Sample

From a command prompt in the root of the SDK, run:

  1. From a command prompt in the root of the SDK, run:

    cd samples/mandelbrot
    make install

    This compiles the Mandelbrot sample code and copies all the files it needs to run to the bin folder in the root directory of the SDK.

  2. Copy this folder to the board.
  3. Navigate to the folder on the board and run the Mandelbrot binary:

  4. You should see output similar to:

    Profiling information:
    Queued time: 0.087ms
    Wait time: 0.059914ms
    Run time: 14.3642ms

    An output image should be created on the board called output.bmp.

Find solutions for Common Issues.

More Information

For more information have a look at the code in mandelbrot.cpp and mandelbrot.cl.