Cooking GPU
Let's play with GPUs for computationally massive projects!
Tuesday, April 4, 2017
Please feel free to make suggestions.
We would appreciate any comments or suggestions that help us improve the blog.
Thursday, May 15, 2014
Example codes for our book
- Download example codes from the publisher's website: http://booksite.elsevier.com/9780124080805/
Erratum
We realized there is an error in the textbook example in Chapter 2.5 (Simple Vector Addition Using CUDA, page 28).
Erroneous code example (AddVectors.cu):
#include "AddVectors.h"
#include "mex.h"
__global__ void addVectorsMask(float* A, float* B, float* C, int size)
{
int i = blockIdx.x;
if(i >= size)
return;
C[i] = A[i] + B[i];
}
Since this is the .cu file, we don't need the "mex.h" header.
We had already corrected this in the example codes (downloadable from the publisher's website), but by mistake we did not delete it from the manuscript.
After fix (AddVectors.cu):
#include "AddVectors.h"
// #include "mex.h"
__global__ void addVectorsMask(float* A, float* B, float* C, int size)
{
int i = blockIdx.x;
if(i >= size)
return;
C[i] = A[i] + B[i];
}
Comment out or delete the line: #include "mex.h"
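For reference, the corrected kernel indexes elements with blockIdx.x only, so it expects a launch with one block per element and one thread per block. A minimal standalone host-side sketch (our own illustration, not from the book; error checking omitted) would look like this:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void addVectorsMask(float* A, float* B, float* C, int size)
{
    int i = blockIdx.x;          // one element per block
    if (i >= size)
        return;
    C[i] = A[i] + B[i];
}

int main(void)
{
    const int size = 8;
    float hA[size], hB[size], hC[size];
    for (int i = 0; i < size; ++i) { hA[i] = (float)i; hB[i] = 2.0f * i; }

    // allocate device buffers and copy the inputs over
    float *dA, *dB, *dC;
    cudaMalloc(&dA, size * sizeof(float));
    cudaMalloc(&dB, size * sizeof(float));
    cudaMalloc(&dC, size * sizeof(float));
    cudaMemcpy(dA, hA, size * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, size * sizeof(float), cudaMemcpyHostToDevice);

    // launch configuration matching the kernel's indexing:
    // 'size' blocks of a single thread each
    addVectorsMask<<<size, 1>>>(dA, dB, dC, size);

    cudaMemcpy(hC, dC, size * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < size; ++i)
        printf("%g\n", hC[i]);   // each hC[i] should equal 3*i

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```

One thread per block is fine for a small teaching example, but it wastes most of each warp; production kernels normally index with blockIdx.x * blockDim.x + threadIdx.x over multi-thread blocks.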
Thank you, and we apologize for the inconvenience.
Jung W. Suh
Monday, April 14, 2014
Friday, January 10, 2014
How should you choose between GPU utilization through C-MEX and the Parallel Computing Toolbox for MATLAB?
Finally, our book “Accelerating MATLAB using GPU Computing” was published. Although this blog is not limited to GPU computing in the MATLAB environment, we are going to start a series of posts on MATLAB-related GPU computing topics that we missed while preparing the book manuscript under a tight deadline (actually, we extended our manuscript due date several times).
In the book, we devoted a lot of space to explaining both ways of making use of the GPU under MATLAB. One is the more general approach of GPU utilization through C-MEX; the other is simply using MathWorks’ Parallel Computing Toolbox for the GPU.
Today we deal with another topic: how to choose between GPU utilization through C-MEX and the Parallel Computing Toolbox for accelerating your MATLAB code. Both approaches have their own strengths and weaknesses. Let’s list their features:
|              | Parallel Computing Toolbox           | C-MEX with CUDA |
|--------------|--------------------------------------|-----------------|
| Ease of use  | Easy                                 | Complicated     |
| Cost         | Extra cost of buying the toolbox     | No extra cost   |
| Flexibility  | Not flexible                         | Flexible        |
| Performance  | Limited                              | Excellent       |
| Limitations  | Many                                 | Few             |
From the ease-of-use point of view, when we choose GPU utilization through C-MEX, the user must explicitly assign the block size and grid size, in addition to handling memory allocation; this forces the user to consider many things for efficiency. Users of the Parallel Computing Toolbox do not have to consider memory allocation or block/grid sizes, for either built-in or non-built-in MATLAB functions, which makes it feel easier. However, this automatic assignment and allocation in the Parallel Computing Toolbox also limits the performance gain, because the method is fixed. Although you may use NVIDIA’s Visual Profiler to analyze your Parallel Computing Toolbox code, it leaves very little room for improvement. In fact, flexibility and ease of use are two sides of the same coin.
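To make concrete what the C-MEX route asks of the user, here is a minimal gateway sketch (the file name, function names, and launch parameters are our own hypothetical choices, not the book's; error checking omitted). Every step that the Parallel Computing Toolbox does automatically appears here as explicit code:

```cuda
// AddVectorsMex.cu -- hypothetical C-MEX gateway; build with: mexcuda AddVectorsMex.cu
#include "mex.h"
#include <cuda_runtime.h>

__global__ void addVectors(const float* A, const float* B, float* C, int size)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < size)
        C[i] = A[i] + B[i];
}

// Usage from MATLAB: C = AddVectorsMex(single(A), single(B))
void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[])
{
    int size = (int)mxGetNumberOfElements(prhs[0]);
    const float* hA = (const float*)mxGetData(prhs[0]);
    const float* hB = (const float*)mxGetData(prhs[1]);

    plhs[0] = mxCreateNumericMatrix(1, size, mxSINGLE_CLASS, mxREAL);
    float* hC = (float*)mxGetData(plhs[0]);

    // explicit device memory management -- the user's responsibility in C-MEX
    float *dA, *dB, *dC;
    cudaMalloc(&dA, size * sizeof(float));
    cudaMalloc(&dB, size * sizeof(float));
    cudaMalloc(&dC, size * sizeof(float));
    cudaMemcpy(dA, hA, size * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, size * sizeof(float), cudaMemcpyHostToDevice);

    // explicit launch configuration -- the user picks block and grid sizes
    int blockSize = 256;
    int gridSize  = (size + blockSize - 1) / blockSize;
    addVectors<<<gridSize, blockSize>>>(dA, dB, dC, size);

    cudaMemcpy(hC, dC, size * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
}
```

This is exactly the flexibility/effort trade-off in the table above: the block size (256 here) and grid size are free parameters you can tune with the profiler, whereas the toolbox fixes those choices for you.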
Usually, the limitations of GPU utilization through C-MEX for MATLAB are the generic limitations of the GPU itself, while the limitations of the Parallel Computing Toolbox come from implementation limits within the toolbox.
The bottom line: if you want a modest performance improvement for a simple project and have the extra budget, then the Parallel Computing Toolbox is a good choice for accelerating your MATLAB code. If your project is large and you want a big performance benefit, then GPU utilization through C-MEX would be the better choice.