💎一站式轻松地调用各大LLM模型接口,支持GPT4、智谱、星火、月之暗面及文生图 广告
# 9.6 使用Python CFFI混合C,C++,Fortran和Python **NOTE**:*此示例代码可以在 https://github.com/dev-cafe/cmake-cookbook/tree/v1.0/chapter-9/recipe-06 中找到,其中有一个C++示例和一个Fortran示例。该示例在CMake 3.11版(或更高版本)中是有效的,并且已经在GNU/Linux、macOS和Windows上进行过测试。* 前面的三个示例中,我们使用Cython、Boost.Python和pybind11作为连接Python和C++的工具。之前的示例中,主要连接的是C++接口。然而,可能会遇到这样的情况:将Python与Fortran或其他语言进行接口。 本示例中,我们将使用Python C的外部函数接口(CFFI,参见https://cffi.readthedocs.io)。由于C是通用语言,大多数编程语言(包括Fortran)都能够与C接口进行通信,所以Python CFFI是将Python与大量语言结合在一起的工具。Python CFFI的特性是,生成简单且非侵入性的C接口,这意味着它既不限制语言特性中的Python层,也不会对C层以下的代码有任何限制。 本示例中,将使用前面示例的银行帐户示例,通过C接口将Python CFFI应用于Python和C++。我们的目标是实现一个上下文感知的接口。接口中,我们可以实例化几个银行帐户,每个帐户都带有其内部状态。我们将通过讨论如何使用Python CFFI来连接Python和Fortran来结束本教程。 第11章第3节中,通过PyPI分发一个用CMake/CFFI构建的C/Fortran/Python项目,届时我们将重新讨论这个例子,并展示如何打包它,使它可以用`pip`安装。 ## 准备工作 我们从C++实现和接口开始,把它们放在名为`account/implementation`的子目录中。实现文件(`cpp_implementation.cpp`)类似于之前的示例,但是包含有断言,因为我们将对象的状态保持在一个不透明的句柄中,所以必须确保对象在访问时已经创建: ```c++ #include "cpp_implementation.hpp" #include <cassert> Account::Account() { balance = 0.0; is_initialized = true; } Account::~Account() { assert(is_initialized); is_initialized = false; } void Account::deposit(const double amount) { assert(is_initialized); balance += amount; } void Account::withdraw(const double amount) { assert(is_initialized); balance -= amount; } double Account::get_balance() const { assert(is_initialized); return balance; } ``` 接口文件(` cpp_implementation.hpp `)包含如下内容: ```c++ #pragma once class Account { public: Account(); ~Account(); void deposit(const double amount); void withdraw(const double amount); double get_balance() const; private: double balance; bool is_initialized; }; ``` 此外,我们隔离了C-C++接口(`c_cpp_interface.cpp`)。这将是我们与Python CFFI连接的接口: ```c++ #include "account.h" #include "cpp_implementation.hpp" #define AS_TYPE(Type, Obj) reinterpret_cast<Type *>(Obj) #define AS_CTYPE(Type, Obj) reinterpret_cast<const Type *>(Obj) account_context_t *account_new() { return AS_TYPE(account_context_t, new Account()); } void account_free(account_context_t *context) { delete AS_TYPE(Account, context); } void account_deposit(account_context_t *context, const double amount) { return AS_TYPE(Account, context)->deposit(amount); } void account_withdraw(account_context_t *context, const double amount) { return AS_TYPE(Account, context)->withdraw(amount); } double account_get_balance(const account_context_t *context) { return AS_CTYPE(Account, context)->get_balance(); } ``` `account`目录下,我们声明了C接口(`account.h`): ```c++ #ifndef ACCOUNT_API #include "account_export.h" #define ACCOUNT_API ACCOUNT_EXPORT #endif #ifdef __cplusplus extern "C" { #endif struct account_context; typedef struct account_context account_context_t; ACCOUNT_API account_context_t *account_new(); ACCOUNT_API void account_free(account_context_t *context); ACCOUNT_API void account_deposit(account_context_t *context, const double amount); ACCOUNT_API void account_withdraw(account_context_t *context, const double amount); ACCOUNT_API double account_get_balance(const account_context_t *context); #ifdef __cplusplus } #endif #endif /* ACCOUNT_H_INCLUDED */ ``` 我们还描述了Python接口,将在稍后对此进行讨论(`__init_ _.py`): ```python from subprocess import check_output from cffi import FFI import os import sys from configparser import ConfigParser from pathlib import Path def get_lib_handle(definitions, header_file, library_file): ffi = FFI() command = ['cc', '-E'] + definitions + [header_file] interface = check_output(command).decode('utf-8') # remove possible \r characters on windows which # would confuse cdef _interface = [l.strip('\r') for l in interface.split('\n')] ffi.cdef('\n'.join(_interface)) lib = ffi.dlopen(library_file) return lib # this interface requires the header file and library file # and these can be either provided by interface_file_names.cfg # in the same path as this file # or if this is not found then using environment variables _this_path = Path(os.path.dirname(os.path.realpath(__file__))) _cfg_file = _this_path / 'interface_file_names.cfg' if _cfg_file.exists(): config = ConfigParser() config.read(_cfg_file) header_file_name = config.get('configuration', 'header_file_name') _header_file = _this_path / 'include' / header_file_name _header_file = str(_header_file) library_file_name = config.get('configuration', 'library_file_name') _library_file = _this_path / 'lib' / library_file_name _library_file = str(_library_file) else: _header_file = os.getenv('ACCOUNT_HEADER_FILE') assert _header_file is not None _library_file = os.getenv('ACCOUNT_LIBRARY_FILE') assert _library_file is not None _lib = get_lib_handle(definitions=['-DACCOUNT_API=', '-DACCOUNT_NOINCLUDE'], header_file=_header_file, library_file=_library_file) # we change names to obtain a more pythonic API new = _lib.account_new free = _lib.account_free deposit = _lib.account_deposit withdraw = _lib.account_withdraw get_balance = _lib.account_get_balance __all__ = [ '__version__', 'new', 'free', 'deposit', 'withdraw', 'get_balance', ] ``` 我们看到,这个接口的大部分工作是通用的和可重用的,实际的接口相当薄。 项目的布局为: ```shell . ├── account │ ├── account.h │ ├── CMakeLists.txt │ ├── implementation │ │ ├── c_cpp_interface.cpp │ │ ├── cpp_implementation.cpp │ │ └── cpp_implementation.hpp │ ├── __init__.py │ └── test.py └── CMakeLists.txt ``` ## 具体实施 现在使用CMake来组合这些文件,形成一个Python模块: 1. 主`CMakeLists.txt`文件包含一个头文件。此外,根据GNU标准,设置编译库的位置: ```cmake # define minimum cmake version cmake_minimum_required(VERSION 3.5 FATAL_ERROR) # project name and supported language project(recipe-06 LANGUAGES CXX) # require C++11 set(CMAKE_CXX_STANDARD 11) set(CMAKE_CXX_EXTENSIONS OFF) set(CMAKE_CXX_STANDARD_REQUIRED ON) # specify where to place libraries include(GNUInstallDirs) set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}) ``` 2. 第二步,是在`account`子目录下包含接口和实现的定义: ```cmake # interface and sources add_subdirectory(account) ``` 3. 主`CMakeLists.txt`文件以测试定义(需要Python解释器)结束: ```cmake # turn on testing enable_testing() # require python find_package(PythonInterp REQUIRED) # define test add_test( NAME python_test COMMAND ${CMAKE_COMMAND} -E env ACCOUNT_MODULE_PATH=${CMAKE_CURRENT_SOURCE_DIR} ACCOUNT_HEADER_FILE=${CMAKE_CURRENT_SOURCE_DIR}/account/account.h ACCOUNT_LIBRARY_FILE=$<TARGET_FILE:account> ${PYTHON_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/account/test.py ) ``` 4. ` account/CMakeLists.txt`中定义了动态库目标: ```cmake add_library(account SHARED plementation/c_cpp_interface.cpp implementation/cpp_implementation.cpp ) target_include_directories(account PRIVATE ${CMAKE_CURRENT_SOURCE_DIR} ${CMAKE_CURRENT_BINARY_DIR} ) ``` 5. 导出一个可移植的头文件: ```cmake include(GenerateExportHeader) generate_export_header(account BASE_NAME account ) ``` 6. 使用Python-C接口进行对接: ```shell $ mkdir -p build $ cd build $ cmake .. $ cmake --build . $ ctest Start 1: python_test 1/1 Test #1: python_test ...................... Passed 0.14 sec 100% tests passed, 0 tests failed out of 1 ``` ## 工作原理 虽然,之前的示例要求我们显式地声明Python-C接口,并将Python名称映射到C(++)符号,但Python CFFI从C头文件(示例中是`account.h`)推断出这种映射。我们只需要向Python CFFI层提供描述C接口的头文件和包含符号的动态库。在主`CMakeLists.txt`文件中使用了环境变量集来实现这一点,这些环境变量可以在`__init__.py`中找到: ```python # ... def get_lib_handle(definitions, header_file, library_file): ffi = FFI() command = ['cc', '-E'] + definitions + [header_file] interface = check_output(command).decode('utf-8') # remove possible \r characters on windows which # would confuse cdef _interface = [l.strip('\r') for l in interface.split('\n')] ffi.cdef('\n'.join(_interface)) lib = ffi.dlopen(library_file) return lib # ... _this_path = Path(os.path.dirname(os.path.realpath(__file__))) _cfg_file = _this_path / 'interface_file_names.cfg' if _cfg_file.exists(): # we will discuss this section in chapter 11, recipe 3 else: _header_file = os.getenv('ACCOUNT_HEADER_FILE') assert _header_file is not None _library_file = os.getenv('ACCOUNT_LIBRARY_FILE') assert _library_file is not None _lib = get_lib_handle(definitions=['-DACCOUNT_API=', '-DACCOUNT_NOINCLUDE'], header_file=_header_file, library_file=_library_file) # ... ``` `get_lib_handle`函数打开头文件(使用`ffi.cdef `)并解析加载库(使用` ffi.dlopen`)。并返回库对象。前面的文件是通用的,可以在不进行修改的情况下重用,用于与Python和C或使用Python CFFI的其他语言进行接口的其他项目。 `_lib`库对象可以直接导出,这里有一个额外的步骤,使Python接口在使用时,感觉更像Python: ```python # we change names to obtain a more pythonic API new = _lib.account_new free = _lib.account_free deposit = _lib.account_deposit withdraw = _lib.account_withdraw get_balance = _lib.account_get_balance __all__ = [ '__version__', 'new', 'free', 'deposit', 'withdraw', 'get_balance', ] ``` 有了这个变化,可以将例子写成下面的方式: ```python import account account1 = account.new() account.deposit(account1, 100.0) ``` 另一种选择则不那么直观: ```python from account import lib account1 = lib.account_new() lib.account_deposit(account1, 100.0) ``` 需要注意的是,如何使用API来实例化和跟踪上下文: ```python account1 = account.new() account.deposit(account1, 10.0) account2 = account.new() account.withdraw(account1, 5.0) account.deposit(account2, 5.0) ``` 为了导入`account`的Python模块,需要提供`ACCOUNT_HEADER_FILE`和`ACCOUNT_LIBRARY_FILE`环境变量,就像测试中那样: ```cmake add_test( NAME python_test COMMAND ${CMAKE_COMMAND} -E env ACCOUNT_MODULE_PATH=${CMAKE_CURRENT_SOURCE_DIR} ACCOUNT_HEADER_FILE=${CMAKE_CURRENT_SOURCE_DIR}/account/account.h ACCOUNT_LIBRARY_FILE=$<TARGET_FILE:account> ${PYTHON_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/account/test.py ) ``` 第11章中,将讨论如何创建一个可以用`pip`安装的Python包,其中头文件和库文件将安装在定义良好的位置,这样就不必定义任何使用Python模块的环境变量。 讨论了Python方面的接口之后,现在看下C的接口。` account.h`内容为: ```c++ struct account_context; typedef struct account_context account_context_t; ACCOUNT_API account_context_t *account_new(); ACCOUNT_API void account_free(account_context_t *context); ACCOUNT_API void account_deposit(account_context_t *context, const double amount); ACCOUNT_API void account_withdraw(account_context_t *context, const double amount); ACCOUNT_API double account_get_balance(const account_context_t *context); ``` 黑盒句柄`account_context`会保存对象的状态。`ACCOUNT_API`定义在`account_export.h`中,由`account/interface/CMakeLists.txt`生成: ```cmake include(GenerateExportHeader) generate_export_header(account BASE_NAME account ) ``` ` account_export.h `头文件定义了接口函数的可见性,并确保这是以一种可移植的方式完成的,实现可以在`cpp_implementation.cpp`中找到。它包含`is_initialized`布尔变量,可以检查这个布尔值确保API函数按照预期的顺序调用:上下文在创建之前或释放之后都不应该被访问。 ## 更多信息 设计Python-C接口时,必须仔细考虑在哪一端分配数组:数组可以在Python端分配并传递给C(++)实现,也可以在返回指针的C(++)实现上分配。后一种方法适用于缓冲区大小事先未知的情况。但返回到分配给C(++)端的数组指针可能会有问题,因为这可能导致Python垃圾收集导致内存泄漏,而Python垃圾收集不会“查看”分配给它的数组。我们建议设计C API,使数组可以在外部分配并传递给C实现。然后,可以在`__init__.py`中分配这些数组,如下例所示: ```python from cffi import FFI import numpy as np _ffi = FFI() def return_array(context, array_len): # create numpy array array_np = np.zeros(array_len, dtype=np.float64) # cast a pointer to its data array_p = _ffi.cast("double *", array_np.ctypes.data) # pass the pointer _lib.mylib_myfunction(context, array_len, array_p) # return the array as a list return array_np.tolist() ``` `return_array`函数返回一个Python列表。因为在Python端完成了所有的分配工作,所以不必担心内存泄漏,可以将清理工作留给垃圾收集。 对于Fortran示例,读者可以参考以下Git库:https://github.com/dev-cafe/cmake-cookbook/tree/v1.0/chapter09/recipe06/Fortran-example 。与C++实现的主要区别在于,account库是由Fortran 90源文件编译而成的,我们在`account/CMakeLists.txt`中使用了Fortran 90源文件: ```cmake add_library(account SHARED implementation/fortran_implementation.f90 ) ``` 上下文保存在用户定义的类型中: ```fortran type :: account private real(c_double) :: balance logical :: is_initialized = .false. end type ``` Fortran实现可以使用`iso_c_binding`模块解析`account.h`中定义的符号和方法: ```fortran module account_implementation use, intrinsic :: iso_c_binding, only: c_double, c_ptr implicit none private public account_new public account_free public account_deposit public account_withdraw public account_get_balance type :: account private real(c_double) :: balance logical :: is_initialized = .false. end type contains type(c_ptr) function account_new() bind (c) use, intrinsic :: iso_c_binding, only: c_loc type(account), pointer :: f_context type(c_ptr) :: context allocate(f_context) context = c_loc(f_context) account_new = context f_context%balance = 0.0d0 f_context%is_initialized = .true. end function subroutine account_free(context) bind (c) use, intrinsic :: iso_c_binding, only: c_f_pointer type(c_ptr), value :: context type(account), pointer :: f_context call c_f_pointer(context, f_context) call check_valid_context(f_context) f_context%balance = 0.0d0 f_context%is_initialized = .false. deallocate(f_context) end subroutine subroutine check_valid_context(f_context) type(account), pointer, intent(in) :: f_context if (.not. associated(f_context)) then print *, 'ERROR: context is not associated' stop 1 end if if (.not. f_context%is_initialized) then print *, 'ERROR: context is not initialized' stop 1 end if end subroutine subroutine account_withdraw(context, amount) bind (c) use, intrinsic :: iso_c_binding, only: c_f_pointer type(c_ptr), value :: context real(c_double), value :: amount type(account), pointer :: f_context call c_f_pointer(context, f_context) call check_valid_context(f_context) f_context%balance = f_context%balance - amount end subroutine subroutine account_deposit(context, amount) bind (c) use, intrinsic :: iso_c_binding, only: c_f_pointer type(c_ptr), value :: context real(c_double), value :: amount type(account), pointer :: f_context call c_f_pointer(context, f_context) call check_valid_context(f_context) f_context%balance = f_context%balance + amount end subroutine real(c_double) function account_get_balance(context) bind (c) use, intrinsic :: iso_c_binding, only: c_f_pointer type(c_ptr), value, intent(in) :: context type(account), pointer :: f_context call c_f_pointer(context, f_context) call check_valid_context(f_context) account_get_balance = f_context%balance end function end module ``` 这个示例和解决方案的灵感来自Armin Ronacher的帖子“Beautiful Native Libraries”: http://lucumr.pocoo.org/2013/8/18/beautiful-native-libraries/