IBM Systems Journal

Online Game Technology   Volume 45, Number 1, 2006

Using advanced compiler technology to exploit the performance of the Cell Broadband Engine™ architecture - References

by A. E. Eichenberger, J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind, R. Archambault, Y. Gao, and R. Koo

Cited references

  1. D. Pham, S. Asano, M. Bolliger, M. N. Day, H. P. Hofstee, C. Johns, J. Kahle, A. Kameyama, J. Keaty, Y. Masubuchi, M. Riley, D. Shippy, D. Stasiak, M. Suzuoki, M. Wang, J. Warnock, S. Weitzel, D. Wendel, T. Yamazaki, and K. Yazawa, “The Design and Implementation of a First-Generation CELL Processor,” Digest of Technical Papers, IEEE International Solid-State Circuits Conference (ISSCC 2005), IEEE International, Piscataway, NJ (February 2005), pp. 184–185, http://www-03.ibm.com/industries/telecom/doc/content/bin/tc_isscc_10.2_cell_design.pdf.
  2. PowerPC Microprocessor Family: AltiVec Technology Programming Environments Manual, IBM Corporation (July 2004).
  3. J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy, “Introduction to the Cell Multiprocessor,” IBM Journal of Research and Development 49, No. 4/5, 589–604 (July/September 2005).
  4. S. Larsen and S. Amarasinghe, “Exploiting Superword-Level Parallelism with Multimedia Instruction Sets,” Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, ACM Press, New York (June 2000), pp. 145–156, http://portal.acm.org/citation.cfm?id=349320.
  5. J. Shin, M. Hall, and J. Chame, “Superword-Level Parallelism in the Presence of Control Flow,” Proceedings of the International Symposium on Code Generation and Optimization (March 2005), pp. 165–175, http://doi.ieeecomputersociety.org/10.1109/CGO.2005.33.
  6. A. Bik, M. Girkar, P. M. Grey, and X. Tian, “Automatic Intra-Register Vectorization for the Intel Architecture,” International Journal of Parallel Programming 30, No. 2, 65–98 (April 2002).
  7. D. Naishlos, M. Biberstein, S. Ben-David, and A. Zaks, “Vectorizing for a SIMdD DSP Architecture,” Proceedings of the International Conference on Compilers, Architectures, and Synthesis for Embedded Systems (October 2003), pp. 2–11.
  8. Crescent Bay Software - VAST/AltiVec, http://www.crescentbaysoftware.com/vast_altivec.html.
  9. N. Sreraman and R. Govindarajan, “A Vectorizing Compiler for Multimedia Extensions,” International Journal of Parallel Programming 28, No. 4, 363–400 (August 2000).
  10. C. G. Lee and M. G. Stoodley, “Simple Vector Microprocessors for Multimedia Applications,” Proceedings of the 31st International Symposium on Microarchitecture, IEEE Computer Society Press, Los Alamitos, CA (1998), pp. 25–36, http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=290951.
  11. A. E. Eichenberger, P. Wu, and K. O'Brien, “Vectorization for SIMD Architectures with Alignment Constraints,” Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, ACM Press, New York (June 2004), pp. 82–93.
  12. P. Wu, A. E. Eichenberger, and A. Wang, “Efficient SIMD Code Generation for Runtime Alignment and Length Conversion,” Proceedings of the International Symposium on Code Generation and Optimization, IEEE Computer Society Press, Los Alamitos, CA (March 2005), pp. 153–164.
  13. P. Wu, A. E. Eichenberger, A. Wang, and P. Zhao, “An Integrated SIMDization Framework Using Virtual Vectors,” Proceedings of the 19th Annual International Conference on Supercomputing, ACM Press, New York (June 2005), pp. 169–178.
  14. Official OpenMP Specifications, OpenMP Architecture Review Board (2002), http://www.openmp.org/specs/.
  15. T. C. Mowry, “Tolerating Latency through Software-Controlled Data Prefetching,” Doctoral dissertation, Stanford University (March 1994).
  16. M. E. Wolf and M. S. Lam, “A Data Locality Optimizing Algorithm,” Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, ACM Press, New York (May 1991), pp. 30–44, http://portal.acm.org/citation.cfm?id=113449&coll=Portal&dl=GUIDE&CFID=54819031&CFTOKEN=14228294.
  17. G. Rivera and C.-W. Tseng, “Tiling Optimizations for 3D Scientific Computation,” Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, IEEE Computer Society, Washington, DC, Online proceedings (November 2000), http://portal.acm.org/citation.cfm?id=370403&coll=Portal&dl=GUIDE&CFID=54819031&CFTOKEN=14228294.
  18. A. Badawy, A. Aggarwal, D. Yeung, and C.-W. Tseng, “Evaluating the Impact of Memory System Performance on Software Prefetching and Locality Optimizations,” Proceedings of the 15th International Conference on Supercomputing, ACM Press, New York (June 2001), pp. 486–500, http://portal.acm.org/citation.cfm?id=377906&coll=Portal&dl=GUIDE&CFID=54819031&CFTOKEN=14228294.
  19. J. Andrews and C. Polychronopoulos, “An Analytical Approach to Performance/Cost Modeling of Parallel Computers,” Journal of Parallel and Distributed Computing 12, No. 4, 343–356 (August 1991).
  20. D. J. Lilja, “A Multiprocessor Architecture Combining Fine-Grained and Coarse-Grained Parallelism Strategies,” Journal of Parallel Computing 20, No. 5, 729–751 (May 1994).

