[Py] pyOpenCL Beginner Notes on Mac

pyOpenCL beginner notes (rough, unorganized version)

Installation, using the Anaconda distribution:

conda install pyopencl

The session below queries the OpenCL platform information (pyopencl.Platform objects).

Python 3.5.2 |Anaconda custom (x86_64)| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import pyopencl as cl

# List the OpenCL platforms available on this machine; returns a list of pyopencl.Platform objects
>>> cl.get_platforms()
[<pyopencl.Platform 'Apple' at 0x7fff0000>]

>>> plat = cl.get_platforms()

# Query the attributes of a pyopencl.Platform: name, extensions, profile, vendor, version
>>> plat[0].name
>>> plat[0].extensions
'cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event'
>>> plat[0].profile
>>> plat[0].vendor
>>> plat[0].version
'OpenCL 1.2 (Oct 14 2016 20:24:13)'
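The interactive queries above can also be collected into a small script. The following is only a sketch: `platform_summary` is a hypothetical helper name (not part of pyopencl), and it reads just the attributes shown in the session (`name`, `vendor`, `version`, `profile`, `extensions`). pyopencl is imported inside `main()` so that a missing install or a missing OpenCL driver produces a short message instead of a traceback.

```python
def platform_summary(platform):
    """Collect the attributes queried above into a plain dict.

    `platform` is expected to be a pyopencl.Platform, but any object
    exposing the same attributes works.
    """
    return {
        "name": platform.name,
        "vendor": platform.vendor,
        "version": platform.version,
        "profile": platform.profile,
        # `extensions` is one space-separated string; split it into a list.
        "extensions": platform.extensions.split(),
    }


def main():
    try:
        import pyopencl as cl
    except ImportError:
        print("pyopencl is not installed")
        return
    try:
        platforms = cl.get_platforms()
    except cl.Error as exc:
        # Raised, for example, when no OpenCL driver is present.
        print("could not query OpenCL platforms:", exc)
        return
    for platform in platforms:
        for key, value in platform_summary(platform).items():
            print(key, value, sep=": ")


if __name__ == "__main__":
    main()
```

Splitting `extensions` into a list makes it easy to check for a single extension with `in`, instead of searching the long string by eye.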

Finding the available devices: the get_devices() method of a pyopencl.Platform object

>>> plat[0].get_devices()
[<pyopencl.Device 'Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz' on 'Apple' at 0xffffffff>, <pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>]
# Returns a list of pyopencl.Device objects
# The call above is equivalent to:
>>> devices = plat[0].get_devices(cl.device_type.ALL)

cl.device_type has the following categories:

    cl.device_type.CPU: An OpenCL device that is the host processor. The host processor runs the OpenCL implementations and is a single or multi-core CPU.
    cl.device_type.GPU: An OpenCL device that is a GPU. By this we mean that the device can also be used to accelerate a 3D API such as OpenGL or DirectX.
    cl.device_type.ACCELERATOR: Dedicated OpenCL accelerators (for example the IBM CELL Blade). These devices communicate with the host processor using a peripheral interconnect such as PCIe.
    cl.device_type.CUSTOM: Dedicated accelerators that do not support programs written in OpenCL C.
    cl.device_type.DEFAULT: The default OpenCL device in the system. The default device cannot be a CL_DEVICE_TYPE_CUSTOM device.
    cl.device_type.ALL: All OpenCL devices available in the system except CL_DEVICE_TYPE_CUSTOM devices.

So to query only the GPUs, you can use:

>>> plat[0].get_devices(cl.device_type.GPU)
[<pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>]
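The device type is a bit field, which is why categories can be combined and why `ALL` matches everything. The sketch below decodes such a value in pure Python, so it runs without an OpenCL driver. The bit values are the `CL_DEVICE_TYPE_*` definitions from the OpenCL header; the helper name `device_type_names` is made up for this example.

```python
# Bit values of CL_DEVICE_TYPE_* as defined in the OpenCL header (cl.h).
DEVICE_TYPE_BITS = [
    ("DEFAULT", 1 << 0),
    ("CPU", 1 << 1),
    ("GPU", 1 << 2),
    ("ACCELERATOR", 1 << 3),
    ("CUSTOM", 1 << 4),
]


def device_type_names(type_bits):
    """Return the names of all device-type bits set in `type_bits`."""
    return [name for name, bit in DEVICE_TYPE_BITS if type_bits & bit]


print(device_type_names(1 << 2))               # ['GPU']
print(device_type_names((1 << 1) | (1 << 2)))  # ['CPU', 'GPU']
```

With pyopencl installed, `device_type_names(dev.type)` should give the same kind of answer for a `pyopencl.Device` object `dev`; pyopencl's own `cl.device_type.to_string(dev.type)` is the built-in way to do this.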

Next, query the device properties, that is, the attributes of the pyopencl.Device object, such as memory size and address width.

# A pyopencl.Device object:
>>> plat[0].get_devices()[0]
<pyopencl.Device 'Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz' on 'Apple' at 0xffffffff>
# Global memory size (bytes)
>>> plat[0].get_devices()[0].global_mem_size
# Maximum work-group size
>>> plat[0].get_devices()[0].max_work_group_size
# Device name
>>> plat[0].get_devices()[0].name
'Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz'
# Device vendor
>>> plat[0].get_devices()[0].vendor
# Address space size in bits (32 or 64)
>>> plat[0].get_devices()[0].address_bits
>>> plat[0].get_devices()[0].extensions
'cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority'

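Other properties are listed on pages 37 to 49 of the OpenCL specification. Drop the "CL_DEVICE_" prefix from a constant and lowercase the rest to get the name of the corresponding pyopencl.Device attribute.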
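The naming rule can be written down as a couple of lines of Python. This is a sketch only: `attribute_name` and `device_report` are hypothetical helpers, not pyopencl functions, and `device_report` simply looks the derived names up with `getattr`, returning None for names the object does not have.

```python
def attribute_name(cl_constant):
    """Turn e.g. 'CL_DEVICE_GLOBAL_MEM_SIZE' into 'global_mem_size'."""
    prefix = "CL_DEVICE_"
    if not cl_constant.startswith(prefix):
        raise ValueError("expected a CL_DEVICE_* constant, got " + cl_constant)
    return cl_constant[len(prefix):].lower()


def device_report(device, cl_constants):
    """Look up each constant's attribute on `device` (None if it is absent)."""
    report = {}
    for constant in cl_constants:
        name = attribute_name(constant)
        report[name] = getattr(device, name, None)
    return report


for constant in ("CL_DEVICE_GLOBAL_MEM_SIZE",
                 "CL_DEVICE_MAX_WORK_GROUP_SIZE",
                 "CL_DEVICE_ADDRESS_BITS"):
    print(constant, "->", attribute_name(constant))
```

Passing a device from `plat[0].get_devices()` as `device` should reproduce the values queried one by one in the session above.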

Building and Deploying a Kernel

To build and deploy a basic OpenCL kernel, you usually need to follow these steps in a typical OpenCL C++ host program:

  1. Obtain an OpenCL platform.
  2. Obtain a device id for at least one device (accelerator), e.g. a CPU or GPU.
  3. Create a context for the selected device or devices. (?)
  4. Create the accelerator program from source code (written in OpenCL C).
  5. Build (compile) the program.
  6. Create one or more kernels from the program functions.
  7. Create a command queue for the target device.
  8. Allocate device memory and move input data from the host to the device memory.
  9. Associate the kernel's arguments with the kernel object (presumably: set the kernel object's arguments on the device).
  10. Deploy the kernel for device execution.
  11. Move the kernel's output data to host memory.
  12. Release context, program, kernels and memory.
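The list describes a C++ host program, but the same twelve steps can be followed from Python. Below is a minimal sketch using pyopencl and NumPy, with the step numbers marked in comments. The kernel (`vector_add`), the choice of the first platform and first device, and the helper names are all assumptions made for illustration; the imports are guarded so the script prints a message when pyopencl or an OpenCL device is unavailable.

```python
KERNEL_SOURCE = """
__kernel void vector_add(__global const float *a,
                         __global const float *b,
                         __global float *out)
{
    int gid = get_global_id(0);
    out[gid] = a[gid] + b[gid];
}
"""


def vector_add(a, b):
    """Add two 1-D arrays on the first OpenCL device that is found."""
    import numpy as np
    import pyopencl as cl

    a = np.ascontiguousarray(a, dtype=np.float32)
    b = np.ascontiguousarray(b, dtype=np.float32)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be 1-D arrays of the same length")

    platform = cl.get_platforms()[0]                    # step 1
    device = platform.get_devices()[0]                  # step 2
    context = cl.Context(devices=[device])              # step 3
    program = cl.Program(context, KERNEL_SOURCE)        # step 4
    program.build()                                     # step 5
    kernel = cl.Kernel(program, "vector_add")           # step 6
    queue = cl.CommandQueue(context, device)            # step 7

    mf = cl.mem_flags                                   # step 8
    a_buf = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    out_buf = cl.Buffer(context, mf.WRITE_ONLY, size=a.nbytes)

    kernel.set_args(a_buf, b_buf, out_buf)              # step 9
    # step 10: one work-item per element, work-group size left to the driver
    cl.enqueue_nd_range_kernel(queue, kernel, a.shape, None)

    out = np.empty_like(a)                              # step 11
    cl.enqueue_copy(queue, out, out_buf)
    queue.finish()

    # step 12: buffers are released explicitly here; the remaining objects
    # are freed when Python garbage-collects them.
    for buf in (a_buf, b_buf, out_buf):
        buf.release()
    return out


def main():
    try:
        import numpy as np
        import pyopencl as cl
    except ImportError:
        print("numpy and pyopencl are required for this example")
        return
    a = np.arange(8, dtype=np.float32)
    b = np.ones(8, dtype=np.float32)
    try:
        print(vector_add(a, b))
    except (cl.Error, IndexError) as exc:
        print("no usable OpenCL device:", exc)


if __name__ == "__main__":
    main()
```

In pyopencl most of step 12 is implicit, which is one reason the Python version is much shorter than an equivalent C++ host program.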


