Computing with Metal (Part 1): Simple Image Processing

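Besides rendering graphics, the GPU can also take over computations that are slow on the CPU (for certain kinds of work, the GPU is faster by a wide margin). GPGPU programming (general-purpose GPU programming) is not a new idea, but when talking to the GPU through OpenGL we could only practice it in roundabout ways, for example by embedding the computation into the graphics rendering pipeline. With Metal this detour is no longer necessary. Metal provides a dedicated compute pipeline that lets us schedule GPU computation with more direct and readable code. The rest of this article uses a simple example, adjusting the saturation of an image, to show how to do computation with Metal.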

Some basic concepts in Metal

Before writing any code, let's briefly review the basic classes and concepts in Metal, including:

MTLDevice
MTLCommandQueue
MTLCommandBuffer
MTLCommandEncoder
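Commands (the work an encoder records into a command buffer; Metal has no class that is actually named MTLCommand)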
MTLComputePipelineState & MTLLibrary & MTLFunction

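At first glance that is a lot of concepts, but in practice the way these classes chain together is quite intuitive. The diagram below summarizes it.

[Figure: Metal Compute Graph.png]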

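In the initialization phase, we first obtain an MTLDevice instance (think of it as the interface to the GPU), and the device creates an MTLCommandQueue (every command sent to the GPU is first placed on a command queue). We also need an MTLLibrary object (as I understand it, it holds the compiled shader functions). From the library we get an MTLFunction that describes the specific computation, and from the function we create an MTLComputePipelineState (the counterpart of a render pipeline; let's call it the compute pipeline).

At run time, we first use the command queue to create a command buffer, then use the command buffer to create a command encoder, which writes commands into the command buffer. Once all commands are encoded, we end encoding and call commit on the command buffer to submit the work to the GPU.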

Talk is cheap

Here comes the code.

    // Initialization phase. This snippet sits inside a function that returns
    // an optional, hence the `return nil` on every failure path.
    guard let device = MTLCreateSystemDefaultDevice() else {
        return nil
    }
    guard let commandQueue = device.makeCommandQueue() else {
        return nil
    }
    
    // The default library holds the functions compiled from the app's .metal files.
    guard let library = device.makeDefaultLibrary() else {
        return nil
    }
    guard let kernelFunction = library.makeFunction(name: "adjust_saturation") else {
        return nil
    }
    
    let computePipelineState: MTLComputePipelineState
    do {
        computePipelineState = try device.makeComputePipelineState(function: kernelFunction)
    } catch {
        return nil
    }

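This code creates, in order, the MTLDevice, MTLCommandQueue, MTLLibrary, MTLFunction and MTLComputePipelineState objects.

The name adjust_saturation used when creating the MTLFunction refers to a shader function defined in a .metal file. Its contents are as follows: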

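#include <metal_stdlib>
using namespace metal;

kernel void adjust_saturation(texture2d<float, access::read> inTexture [[texture(0)]],
                              texture2d<float, access::write> outTexture [[texture(1)]],
                              constant float *saturation [[buffer(0)]],
                              uint2 gid [[thread_position_in_grid]]) {
    // Skip threads that fall outside the texture (possible when the grid is larger than the image).
    if (gid.x >= outTexture.get_width() || gid.y >= outTexture.get_height()) {
        return;
    }
    float4 inColor = inTexture.read(gid);
    // Gray value (luma) of the pixel.
    float value = dot(inColor.rgb, float3(0.299, 0.587, 0.114));
    float4 grayColor(value, value, value, 1.0);
    // saturation = 0 yields the gray color, saturation = 1 yields the original color.
    float4 outColor = mix(grayColor, inColor, *saturation);
    outTexture.write(outColor, gid);
}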

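The function takes two textures (one as input, the other as output), a float that serves as the saturation parameter, and a gid parameter marked [[thread_position_in_grid]]. For now, think of gid as the position of the current invocation within the whole compute job; in this example it doubles as the coordinate of the pixel being processed.

I won't go into much detail about the kernel body. Roughly, it computes a gray value from the RGB value of one pixel in the input texture, blends the original color with that gray value in a proportion given by saturation, and writes the saturation-adjusted result to the output texture.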
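
To make the per-pixel math concrete, here is a minimal CPU-side sketch of the same computation in Swift. It is only a reference for reasoning about the kernel, not part of the Metal code path; the RGBA struct and the function names are illustrative assumptions.

```swift
// CPU reference for the per-pixel math in the adjust_saturation kernel.
// The type and function names are illustrative, not part of Metal.
struct RGBA {
    var r: Float
    var g: Float
    var b: Float
    var a: Float
}

/// Linear interpolation, the same formula as MSL's mix(x, y, t).
func mix(_ x: Float, _ y: Float, _ t: Float) -> Float {
    return x + (y - x) * t
}

func adjustSaturation(_ c: RGBA, saturation: Float) -> RGBA {
    // Same weights as the dot product in the kernel.
    let luma = c.r * 0.299 + c.g * 0.587 + c.b * 0.114
    return RGBA(r: mix(luma, c.r, saturation),
                g: mix(luma, c.g, saturation),
                b: mix(luma, c.b, saturation),
                // The kernel's gray color has alpha 1.0, so alpha is mixed as well.
                a: mix(1.0, c.a, saturation))
}

let red = RGBA(r: 1, g: 0, b: 0, a: 1)
let desaturated = adjustSaturation(red, saturation: 0)  // all channels equal the gray value
let unchanged = adjustSaturation(red, saturation: 1)    // the original color
print(desaturated, unchanged)
```

A saturation of 0 collapses the color to gray and 1 leaves it as it was; values in between blend the two, which is what the mix call in the kernel does.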

Next is the code that runs the computation.

    // prepare input texture
    let cmImage = cmImageFromUIImage(uiImage: image) // custom helper that loads pixel data from a UIImage
    let textureDescriptor = MTLTextureDescriptor()
    textureDescriptor.width = cmImage.width
    textureDescriptor.height = cmImage.height
    // cmImageFromUIImage produces bytes in R, G, B, A order, so use an RGBA format;
    // with bgra8Unorm the shader would see red and blue swapped.
    textureDescriptor.pixelFormat = .rgba8Unorm
    textureDescriptor.usage = .shaderRead
    let inTexture = device.makeTexture(descriptor: textureDescriptor)!
    let region = MTLRegion(origin: MTLOrigin(x: 0, y: 0, z: 0), size: MTLSize(width: cmImage.width, height: cmImage.height, depth: 1))
    // Copy the pixel data into the texture while the Data buffer is guaranteed to be alive.
    cmImage.data!.withUnsafeBytes { rawBuffer in
        inTexture.replace(region: region, mipmapLevel: 0, withBytes: rawBuffer.baseAddress!, bytesPerRow: cmImage.width * 4)
    }
    
    // prepare output texture (same size and format, written by the kernel)
    let outTextureDescriptor = MTLTextureDescriptor()
    outTextureDescriptor.width = cmImage.width
    outTextureDescriptor.height = cmImage.height
    outTextureDescriptor.pixelFormat = .rgba8Unorm
    outTextureDescriptor.usage = .shaderWrite
    let outTexture = device.makeTexture(descriptor: outTextureDescriptor)!
    
    guard let commandBuffer = commandQueue.makeCommandBuffer() else {
        return nil
    }
    
    guard let commandEncoder = commandBuffer.makeComputeCommandEncoder() else {
        return nil
    }
    
    // Bind the pipeline and the kernel arguments; the indices match the
    // [[texture(n)]] and [[buffer(n)]] attributes in the shader.
    commandEncoder.setComputePipelineState(computePipelineState)
    commandEncoder.setTexture(inTexture, index: 0)
    commandEncoder.setTexture(outTexture, index: 1)
    var saturation: Float = 0.1
    commandEncoder.setBytes(&saturation, length: MemoryLayout<Float>.size, index: 0)
    
    let width = cmImage.width
    let height = cmImage.height
    
    let groupSize = 16
    // Round up so that partial tiles at the right and bottom edges are covered.
    // (The original `(width + groupSize) / groupSize - 1` rounds down and skips them.)
    // Threads in those tiles can lie outside the image, so the kernel must
    // ignore any gid beyond the texture size.
    let groupCountWidth = (width + groupSize - 1) / groupSize
    let groupCountHeight = (height + groupSize - 1) / groupSize
    
    commandEncoder.dispatchThreadgroups(MTLSize(width: groupCountWidth, height: groupCountHeight, depth: 1), threadsPerThreadgroup: MTLSize(width: groupSize, height: groupSize, depth: 1))

    commandEncoder.endEncoding()

    commandBuffer.commit()

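First we prepare two MTLTexture objects to serve as the input and output of the computation.
Then we create the command buffer and command encoder, and use the encoder to set the compute pipeline and the kernel's arguments (inTexture, outTexture, saturation).
Finally, dispatchThreadgroups dispatches the work to the GPU. This brings in three more Metal compute concepts: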

thread
thread group
grid size

First, the grid size:

A compute pass must specify the number of times to execute a kernel function. This number corresponds to the grid size, which is defined in terms of threads and threadgroups.

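In other words, the grid size defines how many times in total the shader function runs in one GPU compute pass. It is expressed as an MTLSize with three components; in this example the intended grid is (imageWidth, imageHeight, 1), one thread per pixel. According to the documentation, with dispatchThreadgroups we do not set the grid size directly. We set the threadgroup size and the threadgroup count, and the grid is their product on each axis, so it matches the image exactly only when the image dimensions are multiples of the threadgroup size.

About threadgroup size and threadgroup count: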

A threadgroup is a 3D group of threads that are executed concurrently by a kernel function.

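The threadgroup size defines how many threads are grouped together to run concurrently. Its maximum depends on the GPU hardware. In this example we use (16, 16, 1), that is, 256 threads per threadgroup. The threadgroup count then follows from the resolution of the image.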
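
As a small worked sketch of that last step: provided the kernel ignores threads whose gid lies outside the image, the threadgroup count can be rounded up so that partial tiles at the right and bottom edges are still covered. The image size below is hypothetical and the helper name is an illustrative assumption.

```swift
// Number of threadgroups needed to cover `length` threads along one axis,
// rounding up so that a partial group at the edge is still dispatched.
func threadgroupCount(covering length: Int, groupSize: Int) -> Int {
    return (length + groupSize - 1) / groupSize
}

let imageWidth = 1000    // hypothetical image size, not taken from the article
let imageHeight = 750
let groupSize = 16

let groupsWide = threadgroupCount(covering: imageWidth, groupSize: groupSize)
let groupsHigh = threadgroupCount(covering: imageHeight, groupSize: groupSize)

// The grid that actually runs is count * size on each axis,
// which can be slightly larger than the image.
let gridWidth = groupsWide * groupSize
let gridHeight = groupsHigh * groupSize

print("threadgroups: \(groupsWide) x \(groupsHigh)")
print("grid: \(gridWidth) x \(gridHeight)")
```

On a real device, the upper bound on the total number of threads in one threadgroup can be read from the maxTotalThreadsPerThreadgroup property of the MTLComputePipelineState rather than hard-coded.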

Finally, once the GPU has finished, we can read the result from outTexture and convert it into a UIImage.

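    // Block until the GPU has finished executing the command buffer.
    commandBuffer.waitUntilCompleted()
    
    // create image from out texture
    let imageBytes = UnsafeMutablePointer<UInt8>.allocate(capacity: cmImage.width * cmImage.height * 4)
    // Free the temporary buffer on return; makeImage() below creates the image by a copy operation.
    defer { imageBytes.deallocate() }
    outTexture.getBytes(imageBytes, bytesPerRow: cmImage.width * 4, from: region, mipmapLevel: 0)
    
    let context = CGContext(data: imageBytes, width: cmImage.width, height: cmImage.height, bitsPerComponent: 8, bytesPerRow: cmImage.width * 4, space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)!
    let cgImage = context.makeImage()!
    // cmImageFromUIImage drew the image vertically flipped; .downMirrored flips it back.
    // (UIImage.Orientation was spelled UIImageOrientation before Swift 4.2.)
    return UIImage(cgImage: cgImage, scale: 1.0, orientation: UIImage.Orientation.downMirrored)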
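
The bytesPerRow: width * 4 arguments above assume a tightly packed buffer with 4 bytes per pixel. A minimal sketch of how a pixel is located in such a buffer, using a tiny hypothetical image and an illustrative helper name:

```swift
// Byte offset of pixel (x, y) in a tightly packed buffer with 4 bytes per pixel.
func pixelOffset(x: Int, y: Int, bytesPerRow: Int) -> Int {
    return y * bytesPerRow + x * 4
}

let width = 4            // tiny hypothetical image
let height = 2
let bytesPerRow = width * 4
var pixels = [UInt8](repeating: 0, count: bytesPerRow * height)

// Write an opaque red pixel at (x: 1, y: 1), assuming the R, G, B, A byte
// order that a CGContext created with premultipliedLast uses.
let offset = pixelOffset(x: 1, y: 1, bytesPerRow: bytesPerRow)
pixels[offset] = 255        // R
pixels[offset + 3] = 255    // A
print(offset, pixels.count)
```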

UIImage --> MTLTexture

The example uses a custom helper to get pixel data out of a UIImage. The code is included below, for reference only.

import UIKit

// Simple container for raw pixel data: 4 bytes per pixel, in R, G, B, A order.
class CMImage: NSObject {
    var width: Int = 0
    var height: Int = 0
    var data: Data?
}

func cmImageFromUIImage(uiImage: UIImage) -> CMImage {
    let image = CMImage()
    // Note: size is measured in points, so an image with scale > 1 is drawn at reduced resolution.
    image.width = Int(uiImage.size.width)
    image.height = Int(uiImage.size.height)
    
    let bytes = UnsafeMutablePointer<UInt8>.allocate(capacity: image.width * image.height * 4)
    // Data(bytes:count:) below copies the buffer, so free it on return.
    defer { bytes.deallocate() }
    let context = CGContext(data: bytes, width: image.width, height: image.height, bitsPerComponent: 8, bytesPerRow: image.width * 4, space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
    // Flip the coordinate system vertically before drawing.
    context?.translateBy(x: 0, y: uiImage.size.height)
    context?.scaleBy(x: 1, y: -1)
    context?.draw(uiImage.cgImage!, in: CGRect(x: 0, y: 0, width: uiImage.size.width, height: uiImage.size.height))
    image.data = Data(bytes: bytes, count: image.width * image.height * 4)
    
    return image
}

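Final notes

For convenience, this example crams the initialization-phase code and the compute-pass code into a single method. According to Apple's best-practices documentation, however, objects such as the device, library, command queue and compute pipeline state should be created only once, during app initialization, rather than every time a computation runs.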
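
One possible way to follow that advice is to keep the long-lived objects in a small container that is created once and reused for every image. This is only a sketch under assumptions: the class name SaturationFilterContext is hypothetical, and the kernel is assumed to be compiled into the app's default library as in the example above.

```swift
#if canImport(Metal)
import Metal

// Hypothetical container for the long-lived Metal objects; the name is illustrative.
final class SaturationFilterContext {
    let device: MTLDevice
    let commandQueue: MTLCommandQueue
    let pipelineState: MTLComputePipelineState

    // Fails if Metal is unavailable or the kernel is not found in the default library.
    init?(kernelName: String = "adjust_saturation") {
        guard let device = MTLCreateSystemDefaultDevice(),
              let commandQueue = device.makeCommandQueue(),
              let library = device.makeDefaultLibrary(),
              let function = library.makeFunction(name: kernelName),
              let pipelineState = try? device.makeComputePipelineState(function: function)
        else {
            return nil
        }
        self.device = device
        self.commandQueue = commandQueue
        self.pipelineState = pipelineState
    }
}
#endif
```

With such a container created at app startup, each compute pass only needs to make a new command buffer and encoder from the stored commandQueue and bind the stored pipelineState.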

This is only the Hello World of computing with Metal. There is a lot more worth studying in depth; if you are interested, let's keep at it together!
