The Compiler Project for Scalable Architectures at IBM Research



  Examples of Code Handled by the Simdization Framework

Example 1: A vanilla loop computing vector add.

for (j=0; j<n; j++) c[j] = a[j]+b[j];
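To make the SIMD parallelism in this loop concrete, here is a minimal sketch in portable C of the strip-mined shape a simdizer might produce. It is not the framework's actual output. The function names, the `float` element type, and the lane count `VLEN` of 4 are assumptions made for illustration; each trip of the outer loop stands in for one vector add, and a scalar loop handles the leftover elements.

```c
#include <stddef.h>

#define VLEN 4  /* assumed lane count: four 32-bit floats per 16-byte register */

/* Reference form: the loop from Example 1. */
void vadd_scalar(float *c, const float *a, const float *b, size_t n) {
    for (size_t j = 0; j < n; j++)
        c[j] = a[j] + b[j];
}

/* Strip-mined form (hypothetical name): the inner k loop touches VLEN
 * adjacent elements and models a single vector add. */
void vadd_blocked(float *c, const float *a, const float *b, size_t n) {
    size_t j = 0;
    for (; j + VLEN <= n; j += VLEN)
        for (size_t k = 0; k < VLEN; k++)
            c[j + k] = a[j + k] + b[j + k];
    /* Scalar epilogue for the n % VLEN leftover elements. */
    for (; j < n; j++)
        c[j] = a[j] + b[j];
}
```

Both functions write the same values to `c`; only the loop structure differs.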

Example 2: The loop body involves an aggregate copy and computation on adjacent members of aggregates. SIMD parallelism can be extracted at the basic-block level.

for (i=0; i<n; i++) {
    q = quads[i];

    vertex_results[i].x = W0 * q.p[0].x + W1 * q.p[1].x + W2 * q.p[2].x + W3 * q.p[3].x;
    vertex_results[i].y = W0 * q.p[0].y + W1 * q.p[1].y + W2 * q.p[2].y + W3 * q.p[3].y;
    vertex_results[i].z = W0 * q.p[0].z + W1 * q.p[1].z + W2 * q.p[2].z + W3 * q.p[3].z;
    vertex_results[i].w = W0 * q.p[0].w + W1 * q.p[1].w + W2 * q.p[2].w + W3 * q.p[3].w;
}
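The four statements in this loop body apply the same operations to the adjacent members x, y, z and w, which is what makes them packable into vector operations within a single basic block. The sketch below shows one way to view that packing in plain C. The type declarations (`point4`, `quad`), the helper `madd`, the function name `blend_quads`, and passing the weights as an array `W` are all assumptions, since the page does not show the declarations.

```c
#include <stddef.h>

/* Hypothetical types: four adjacent floats per point, four points per quad. */
typedef struct { float x, y, z, w; } point4;
typedef struct { point4 p[4]; } quad;

/* Lane-wise acc + s * v; stands in for one SIMD multiply-add. */
static point4 madd(point4 acc, float s, point4 v) {
    acc.x += s * v.x;
    acc.y += s * v.y;
    acc.z += s * v.z;
    acc.w += s * v.w;
    return acc;
}

/* Same computation as the loop in Example 2, written so that each
 * multiply-add works on all four members of a point at once. */
void blend_quads(point4 *vertex_results, const quad *quads, size_t n,
                 const float W[4]) {
    for (size_t i = 0; i < n; i++) {
        quad q = quads[i];               /* aggregate copy */
        point4 r = {0.0f, 0.0f, 0.0f, 0.0f};
        for (int k = 0; k < 4; k++)
            r = madd(r, W[k], q.p[k]);   /* r += W[k] * q.p[k] */
        vertex_results[i] = r;
    }
}
```

Each call to `madd` corresponds to one column of the original four statements, so the loop body needs four packed multiply-adds where the scalar form needs sixteen multiplies and twelve adds.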

Example 3: This loop contains an arbitrary combination of misalignments and unknown loop bounds.

for (i=lowBound; i<highBound; i++) {
    vout0[i+sindex3] = in0[i+sindex0] + in1[i+sindex2] + in2[i+sindex2] + in3[i+sindex3];
    vout1[i+sindex2] = in5[i+sindex0] + in7[i+sindex0] + in4[i+sindex1] + in6[i+sindex3];
    vout2[i+sindex2] = in10[i+sindex1] + in11[i+sindex2] + in8[i+sindex3] + in9[i+sindex3];
    vout3[i+sindex1] = in13[i+sindex0] + in14[i+sindex0] + in12[i+sindex2] + in15[i+sindex3];
}
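Because each stream in this loop is offset by a different runtime value, the streams generally start at different positions within a vector-sized chunk of memory. On targets whose vector loads can only access aligned chunks (an assumption here), a misaligned vector can be assembled from two neighboring aligned loads plus a shift. The sketch below models that idea in scalar C. It illustrates the general technique only, not the framework's own code generation, and the type `vec4i` and the three function names are hypothetical.

```c
#include <stddef.h>

#define VLEN 4  /* assumed lane count: four 32-bit ints per 16-byte chunk */

typedef struct { int lane[VLEN]; } vec4i;

/* Models an aligned vector load: reads chunk number `chunk` of `base`,
 * where `base` is assumed to sit on a chunk boundary. */
static vec4i load_aligned(const int *base, size_t chunk) {
    vec4i v;
    for (size_t k = 0; k < VLEN; k++)
        v.lane[k] = base[chunk * VLEN + k];
    return v;
}

/* Models a shift across two registers: takes VLEN lanes starting at
 * offset `off` within the concatenation lo:hi. */
static vec4i shift_pair(vec4i lo, vec4i hi, size_t off) {
    vec4i r;
    for (size_t k = 0; k < VLEN; k++) {
        size_t s = off + k;
        r.lane[k] = (s < VLEN) ? lo.lane[s] : hi.lane[s - VLEN];
    }
    return r;
}

/* Loads base[idx .. idx+VLEN-1] using aligned loads only. When idx is
 * misaligned, the chunk after the first one must be readable. */
vec4i load_at(const int *base, size_t idx) {
    size_t chunk = idx / VLEN;
    size_t off = idx % VLEN;      /* runtime misalignment, in elements */
    vec4i lo = load_aligned(base, chunk);
    if (off == 0)
        return lo;                /* already aligned: one load suffices */
    return shift_pair(lo, load_aligned(base, chunk + 1), off);
}
```

In Example 3 the offset for each stream would come from the values of `lowBound` and the `sindex` variables, which are not known until the loop runs.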

Example 4: This loop nest involves a reduction, short-to-int data conversion, and runtime alignment on b[i] and b[i+j] (because the i loop is triangular: its trip count, M-j, depends on the outer index j).

short b[M];
int a;
...
for (j=0; j<n; j++) {
    a = 0;
    for (i=0; i<M-j; i++)
        a += ((int)b[i]*(int)b[i+j])>>n;
}
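A common way to vectorize a sum reduction like the inner loop here is to keep one partial sum per lane and combine the lanes after the loop. The sketch below shows that pattern in scalar C for a single value of j. The function names, the `shift` parameter (standing in for the `>>n` in the original), and the lane count `VLEN` are assumptions. It also assumes the sums do not overflow an `int` and that the products are non-negative, since right-shifting a negative `int` is implementation-defined in C.

```c
#include <stddef.h>

#define VLEN 4  /* assumed lane count: four 32-bit ints after widening */

/* Reference form: the inner loop of Example 4 for one value of j. */
int corr_scalar(const short *b, size_t m, size_t j, int shift) {
    int a = 0;
    for (size_t i = 0; i + j < m; i++)
        a += ((int)b[i] * (int)b[i + j]) >> shift;
    return a;
}

/* Partial-sum form (hypothetical name): lane k accumulates the elements
 * i with i % VLEN == k; the lanes are summed once at the end. */
int corr_partial(const short *b, size_t m, size_t j, int shift) {
    int part[VLEN] = {0, 0, 0, 0};
    size_t len = (j < m) ? m - j : 0;   /* triangular trip count M - j */
    size_t i = 0;
    for (; i + VLEN <= len; i += VLEN)
        for (size_t k = 0; k < VLEN; k++)
            part[k] += ((int)b[i + k] * (int)b[i + k + j]) >> shift;
    int a = part[0] + part[1] + part[2] + part[3];
    /* Scalar epilogue for the leftover len % VLEN elements. */
    for (; i < len; i++)
        a += ((int)b[i] * (int)b[i + j]) >> shift;
    return a;
}
```

The two forms add the same terms in a different order, which is safe for integer addition as long as no intermediate sum overflows.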

Example 5: Unrolled loops.

for (i = m; i < n; i = i + 4) {
    dy[i] = dy[i] + da*dx[i];
    dy[i+1] = dy[i+1] + da*dx[i+1];
    dy[i+2] = dy[i+2] + da*dx[i+2];
    dy[i+3] = dy[i+3] + da*dx[i+3];
}
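The four statements in this unrolled body are identical apart from their index offsets 0 through 3, so together they cover four adjacent elements per trip. The sketch below writes the unrolled loop next to the rolled loop it is equivalent to, as one way to see why such code is a SIMD candidate. The function names and the `float` element type are assumptions, and the equivalence holds only when `n - m` is a multiple of 4, because the unrolled form always processes whole groups of four.

```c
#include <stddef.h>

/* The loop from Example 5, with float assumed as the element type. */
void axpy_unrolled4(float *dy, const float *dx, float da, size_t m, size_t n) {
    for (size_t i = m; i < n; i = i + 4) {
        dy[i]     = dy[i]     + da * dx[i];
        dy[i + 1] = dy[i + 1] + da * dx[i + 1];
        dy[i + 2] = dy[i + 2] + da * dx[i + 2];
        dy[i + 3] = dy[i + 3] + da * dx[i + 3];
    }
}

/* Rolled form (hypothetical name): one statement per element. It matches
 * the unrolled form when (n - m) is a multiple of 4. */
void axpy_rolled(float *dy, const float *dx, float da, size_t m, size_t n) {
    for (size_t i = m; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}
```

Seen this way, each trip of the unrolled loop is one packed multiply-add over `dy[i..i+3]` and `dx[i..i+3]`, with `da` replicated across the lanes.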

 
