CS 848 Project - CUDA Data Parallel Primitives Library
Cătălin-Alexandru Avram - UWid: 20334317

Suggested Papers

[pdf | avi] Hensley, Justin, Thorsten Scheuermann, Greg Coombe, Montek Singh, and Anselmo Lastra. 2005. Fast Summed-Area Table Generation and Its Applications. Computer Graphics Forum 24(3), pp. 547-555.

[pdf] Crow, Franklin. 1984. Summed-Area Tables for Texture Mapping. In Computer Graphics (Proceedings of SIGGRAPH 1984) 18(3), pp. 207-212.

[pdf] Blelloch, Guy E. 1990. Prefix Sums and Their Applications. Tech report (Synthesis of Parallel Algorithms).

[pdf] Yuri Dotsenko, Naga K. Govindaraju, Peter-Pike Sloan, Charles Boyd, and John Manferdelli. 2008. Fast scan algorithms on graphics processors. In Proceedings of the 22nd annual international conference on Supercomputing (ICS '08). ACM, New York, NY, USA, 205-213. http://doi.acm.org/10.1145/1375527.1375559

[pdf] Hagen Peters, Ole Schulz-Hildebrandt, and Norbert Luttenberger. Fast comparison-based in-place sorting with CUDA. In Eighth International Conference on Parallel Processing and Applied Mathematics, September 2009.

[pdf] Markus Billeter, Ola Olsson, and Ulf Assarsson. Efficient Stream Compaction on Wide SIMD Many-Core Architectures. In Proceedings of High Performance Graphics 2009, pages 159-166, August 2009.

[pdf] Vibhav Vineet, Pawan Harish, Suryakant Patidar, and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. In Proceedings of High Performance Graphics 2009, pages 167-171, August 2009.

[pdf] Nadathur Satish, Mark Harris, and Michael Garland. Designing Efficient Sorting Algorithms for Manycore GPUs. In Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium, May 2009.

[pdf] Shubhabrata Sengupta, Mark Harris, and Michael Garland. Efficient Parallel Scan Algorithms for GPUs. Technical Report NVR-2008-003, NVIDIA Corporation, December 2008.

[pdf] Qiming Hou, Kun Zhou, and Baining Guo. BSGP: Bulk-Synchronous GPU Programming. ACM Transactions on Graphics, 27(3):19:1-19:13, August 2008.

[pdf | slides] Shubhabrata Sengupta, Mark Harris, Yao Zhang, and John D. Owens. Scan Primitives for GPU Computing. In Graphics Hardware 2007, pages 97-106, August 2007.

Other Publications that use CUDPP

[pdf] Deyuan Qiu, Stefan May, and Andreas Nüchter. GPU-accelerated Nearest Neighbor Search for 3D Registration. In ICVS 2009: Proceedings of the 7th International Conference on Computer Vision Systems, October 2009.

[pdf] Apeksha Godiyal, Jared Hoberock, Michael Garland, and John C. Hart. Rapid Multipole Graph Drawing on the GPU. In Proceedings of the 16th International Symposium on Graph Drawing, volume 5417 of Lecture Notes in Computer Science, pages 90-101. Springer, September 2009.

[pdf] Jared Hoberock, Victor Lu, Yuntao Jia, and John C. Hart. Stream Compaction for Deferred Shading. In Proceedings of High Performance Graphics 2009, pages 173-180, August 2009.

[pdf] Anjul Patney, Mohamed S. Ebeida, and John D. Owens. Parallel View-Dependent Tessellation of Catmull-Clark Subdivision Surfaces. In Proceedings of High Performance Graphics 2009, pages 99-108, August 2009.

[pdf] Sean P. Ponce. Towards Algorithm Transformation for Temporal Data Mining on GPU. Master's thesis, Department of Computer Science, Virginia Polytechnic Institute and State University, 7 July 2009.

[pdf] Christian Eisenacher, Quirin Meyer, and Charles Loop. Real-Time View-Dependent Rendering of Parametric Surfaces. In I3D '09: Proceedings of the 2009 Symposium on Interactive 3D Graphics and Games, pages 137-143, February/March 2009.

[pdf] Yannick Allusse, Patrick Horain, Ankit Agarwal, and Cindula Saipriyadarshan. GpuCV: A GPU-Accelerated Framework for Image Processing and Computer Vision. In Advances in Visual Computing, volume 5359 of Lecture Notes in Computer Science, pages 430-439. Springer, December 2008.

[pdf] Anjul Patney and John D. Owens. Real-Time Reyes-Style Adaptive Surface Subdivision. ACM Transactions on Graphics, 27(5):143:1-143:8, December 2008.

[pdf] Kun Zhou, Qiming Hou, Rui Wang, and Baining Guo. Real-time KD-tree Construction on Graphics Hardware. ACM Transactions on Graphics, 27(5):126:1-126:11, December 2008.

[acm] George Stantchev, William Dorland, and Nail Gumerov. Fast parallel Particle-To-Grid interpolation for plasma PIC simulations on the GPU. Journal of Parallel and Distributed Computing, 68(10):1339-1349, October 2008.

[pdf] Jike Chong, Youngmin Yi, Arlo Faria, Nadathur Satish, and Kurt Keutzer. Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors. In Proceedings of the 1st Annual Workshop on Emerging Applications and Many Core Architecture (EAMA), pages 23-35, June 2008.

[pdf] Alexander Ladikos, Selim Benhimane, and Nassir Navab. Efficient Visual Hull Computation for Real-Time 3D Reconstruction using CUDA. In CVPRW '08: Computer Vision and Pattern Recognition Workshops, pages 1-8, June 2008.

[acm] Dominique Aubert, Mehdi Amini, and Romaric David. A Particle-Mesh Integrator for Galactic Dynamics Powered by GPGPUs. In Proceedings of the 9th International Conference on Computational Science, volume 5544 of Lecture Notes in Computer Science, pages 874-883. Springer, May 2009.

[pdf] Kun Zhou, Minmin Gong, Xin Huang, and Baining Guo. Highly Parallel Surface Reconstruction. Technical Report MSR-TR-2008-53, Microsoft Research, 1 April 2008.