Monday, September 28, 2015

CUDA: Check Dimension

since: 2015/09/28
update: 2015/09/28

reference:
1. Professional CUDA C Programming
2. I touchs: Get Started On NVIDIA Jetson TK1
3. I touchs: How to re-flash your Jetson TK1 Development Kit 
4. I touchs: Using the Jetson TK1 as a remote development environment for CUDA


A. Write the following program in Nsight Eclipse Edition:
     // File name: checkDimension.cu
#include <cuda_runtime.h>
#include <stdio.h>


__global__ void checkIndex(const int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < N)
    {
        printf("threadIdx:(%d, %d, %d), blockIdx:(%d, %d, %d), blockDim:(%d, %d, %d), gridDim:(%d, %d, %d)\n",
               threadIdx.x, threadIdx.y, threadIdx.z,
               blockIdx.x, blockIdx.y, blockIdx.z,
               blockDim.x, blockDim.y, blockDim.z,
               gridDim.x, gridDim.y, gridDim.z);
    }
}

int main(int argc, char **argv)
{
    // define the total number of data elements
    int nElem = 5;
    printf("\n");
    printf(">> total data count = %d \n", nElem);

    // define grid and block structure
    dim3 block(3);
    dim3 grid((nElem + block.x - 1) / block.x);
    printf(">> define block dimension = %d \n", block.x);
    printf(">> we get grid dimension = %d \n", grid.x);

    // check grid and block dimension from host side
    printf("\n* check grid and block dimension from host side *\n");
    printf("grid.x = %d , grid.y = %d , grid.z = %d\n", grid.x, grid.y, grid.z);
    printf("block.x = %d , block.y = %d , block.z = %d\n", block.x, block.y, block.z);
    printf("\n");

    // check grid and block dimension from device side
    printf("* check grid and block dimension from device side *\n");
    checkIndex<<<grid, block>>>(nElem);

    // reset device before you leave
    cudaDeviceReset();

    return 0;
}


-----------------------------------------------------------------------------------------------

B. Execution result:

Thursday, September 24, 2015

openFrameworks: v0.8.4 with Xcode 7.0

since: 2015/09/24
update: 2015/09/26
reference:
1. Warning : before upgrading to Xcode 7 - x-code - openFrameworks

A. Description:
     Xcode 7 does not include the OS X 10.10 SDK or any earlier version,
     which can cause build errors such as:

         a. opensslconf.h file not found

         b. There is no SDK with the name or path

-----------------------------------------------------------------------------------------------

B. Solution:
    1. Before updating to Xcode 7, back up MacOSX10.10.sdk from the following directory:
         Applications > Xcode > (right-click) Show Package Contents >
         Contents > Developer > Platforms > MacOSX.platform >
         Developer > SDKs > MacOSX10.10.sdk

        Alternatively, download it directly from the web: MacOSX10.10.sdk.dmg

    2. After updating to Xcode 7, copy the MacOSX10.10.sdk that you backed up or downloaded
        back into the corresponding SDKs directory of Xcode 7.

    3. Open the Xcode project and select openFrameworksLib.xcodeproj on the left:
        Right pane > Build Settings > Architectures > Base SDK: select OS X 10.10

    4. Likewise, apply the same setting to the Xcode project itself.

-----------------------------------------------------------------------------------------------

C. Notes:
     If the problem is only with openssl, it can be resolved as follows:

     1. Install openssl on the Mac:
          $ brew install openssl

     2. Open the Xcode project and click CoreOF.xcconfig on the left

     3. On the right, add the following at the bottom of the HEADER definitions:
         ....
         //@add for openssl

         HEADER_OPENSSL = "$(OF_PATH)/libs/openssl/include"
         ....

     4. On the right, append the following to the end of OF_CORE_HEADERS = :  $(HEADER_OPENSSL)
         ....
         ....

Saturday, September 5, 2015

CUDA: Hello World!

since: 2015/09/05
update: 2015/09/05

reference:
1. Professional CUDA C Programming
2. I touchs: Get Started On NVIDIA Jetson TK1
3. I touchs: How to re-flash your Jetson TK1 Development Kit 
4. I touchs: Using the Jetson TK1 as a remote development environment for CUDA


A. Write the following program in Nsight Eclipse Edition:
     // File name: hello.cu
#include <stdio.h>

// __global__:
// function will be called on the CPU and executed on the GPU

__global__ void helloFromGPU()
{
    // blockIdx.x for block index
    // threadIdx.x for thread index

    printf("Hello World from GPU! block %d thread %d \n", blockIdx.x, threadIdx.x);
}

int main(void)
{
    // hello from cpu
    printf("\nHello World from CPU!\n\n");

    // kernel(GPU code) configuration: total threads = blockNum x threadNum
    int blockNum = 2;
    int threadNum = 3;

    // hello from gpu: call from the host thread to the code on the device side
    helloFromGPU<<<blockNum, threadNum>>>();
    cudaDeviceReset(); // destroy and clean up all resources (this also flushes the device-side printf buffer)
    //cudaDeviceSynchronize();

    return 0;
}


-----------------------------------------------------------------------------------------------

B. Execution result: