CUDA Application Design and Development

November 3, 2011  SteveH

CUDA Application Design and Development by Rob Farber has just been published!

Review the full Table of Contents below.

About the Book

As the computer industry retools to leverage massively parallel graphics processing units (GPUs), this book is designed to meet the needs of working software developers who need to understand GPU programming with CUDA and increase efficiency in their projects. CUDA Application Design and Development starts with an introduction to parallel computing concepts for readers with no previous parallel experience, and focuses on issues of immediate importance to working software developers: achieving high performance, maintaining competitiveness, analyzing CUDA benefits versus costs, and determining application lifespan.

The book then details the thought behind CUDA and teaches how to create, analyze, and debug CUDA applications. Throughout, the focus is on software engineering issues: how to use CUDA in the context of existing application code, with existing compilers, languages, software tools, and industry-standard API libraries.

Using an approach refined in a series of well-received articles at Dr. Dobb’s Journal, author Rob Farber takes the reader step by step from fundamentals to implementation, moving from language theory to practical coding.

Key Features

  • Includes multiple examples building from simple to more complex applications in four key areas: machine learning, visualization, vision recognition, and mobile computing
  • Addresses the foundational issues for CUDA development: multi-threaded programming and a memory hierarchy that differs from the CPU's
  • Includes teaching chapters designed to give a full understanding of CUDA tools, techniques and structure
  • Presents CUDA techniques in the context of the hardware they are implemented on as well as other styles of programming that will help readers bridge into the new material

Quotes:

“The most important thing that this book will offer is the application specific examples and the additional detail of how to optimize CUDA for different application areas. This book will also be valuable as an alternative to the existing textbooks, since it is written by a user with an application perspective.”

- David Kirk, author of Programming Massively Parallel Processors and former NVIDIA Chief Scientist


Table of Contents:

CHAPTER 1: First Programs and How to Think in CUDA

Source Code and Wiki

Distinguishing CUDA from Conventional Programming with a Simple Example

Choosing a CUDA API

Some Basic CUDA Concepts

Understanding Our First Runtime Kernel

Three Rules of GPGPU Programming

Big-O Considerations and Data Transfers

CUDA and Amdahl’s Law

Data and Task Parallelism

Hybrid Execution: Using Both CPU and GPU Resources

Regression Testing and Accuracy

Silent Errors

Introduction to Debugging

UNIX Debugging

Windows Debugging with Parallel Nsight

Summary

CHAPTER 2: CUDA for Machine Learning and Optimization

Modeling and Simulation

Machine Learning and Neural Networks

XOR: An Important Nonlinear Machine-Learning Problem

Performance Results on XOR

Performance Discussion

Summary

The C++ Nelder-Mead Template

CHAPTER 3: The CUDA Tool Suite: Profiling a PCA/NLPCA Functor

PCA and NLPCA

Obtaining Basic Profile Information

Gprof: A Common UNIX Profiler

The NVIDIA Visual Profiler: Computeprof

Parallel Nsight for Microsoft Visual Studio

Tuning and Analysis Utilities (TAU)

Summary

CHAPTER 4: The CUDA Execution Model

GPU Architecture Overview

Warp Scheduling and TLP

ILP: Higher Performance at Lower Occupancy

Little’s Law

CUDA Tools to Identify Limiting Factors

Summary

CHAPTER 5: CUDA Memory

The CUDA Memory Hierarchy

GPU Memory

L2 Cache

L1 Cache

CUDA Memory Types

Global Memory

Summary

CHAPTER 6: Efficiently Using GPU Memory

Reduction

Utilizing Irregular Data Structures

Sparse Matrices and the CUSP Library

Graph Algorithms

SoA, AoS, and Other Structures

Tiles and Stencils

Summary

CHAPTER 7: Techniques to Increase Parallelism

CUDA Contexts Extend Parallelism

Streams and Contexts

Out-of-Order Execution with Multiple Streams

Tying Data to Computation

Summary

CHAPTER 8: CUDA for All GPU and CPU Applications

Pathways from CUDA to Multiple Hardware Backends

Accessing CUDA from Other Languages

Libraries

CUBLAS

CUFFT

Summary

CHAPTER 9: Mixing CUDA and Rendering

OpenGL

GLUT

Introduction to the Files in the Framework

Summary

CHAPTER 10: CUDA in a Cloud and Cluster Environments

The Message Passing Interface (MPI)

How MPI Communicates

Bandwidth

Balance Ratios

Considerations for Large MPI Runs

Cloud Computing

A Code Example

Summary

CHAPTER 11: CUDA for Real Problems

Working with High-Dimensional Data

PCA/NLPCA

Force-Directed Graphs

Monte Carlo Methods

Molecular Modeling

Quantum Chemistry

Interactive Workflows

A Plethora of Projects

Summary

CHAPTER 12: Application Focus on Live Streaming Video

Topics in Machine Vision

FFmpeg

TCP Server

Live Stream Application

The simpleVBO.cpp File

The callbacksVBO.cpp File

Building and Running the Code

The Future

Summary

ISBN: 9780123884268
