Converting [UInt32] -> [UInt8] -> [[UInt8]] in Swift

I am trying to speed up my current implementation of a function that converts [UInt32] to [UInt8], which is then split into [[UInt8]] with 6 values at each index.

My implementation:

extension Array {
    func splitBy(subSize: Int) -> [[Element]] {
        return 0.stride(to: self.count, by: subSize).map { startIndex in
            let endIndex = startIndex.advancedBy(subSize, limit: self.count)
            return Array(self[startIndex ..< endIndex])
        }
    }
}

func convertWordToBytes(fullW: [UInt32]) -> [[UInt8]] {
    var combined8 = [UInt8]()

    // Convert 17 [UInt32] to 68 [UInt8]
    for i in 0...16 {
        _ = 24.stride(through: 0, by: -8).map {
            combined8.append(UInt8(truncatingBitPattern: fullW[i] >> UInt32($0)))
        }
    }

    // Split [UInt8] to [[UInt8]] with 6 values at each index.
    let combined48 = combined8.splitBy(6)

    return combined48
}

This function is called millions of times in my program, and its speed is a major bottleneck.

Does anyone have any ideas? Thanks.

Source

2016-10-28 p0ppy

You might want to post your code at http://codereview.stackexchange.com. – rmaddy

Your code is in Swift 2. Do you want to keep it in Swift 2 or update it to Swift 3 at the same time? –

This computer is too old, so I have to stay on Swift 2 for now. – p0ppy

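If you profile your code (Cmd + I), you will see that most of the time is spent in various "copy to buffer" functions. This happens when you append a new element to an array that has run out of its initially allocated space: the contents must be moved to a location on the heap with more memory. Moral of the lesson: heap allocation is slow but unavoidable with arrays. Do it as few times as possible.

Try this: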
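One way to illustrate the point about repeated reallocation is to reserve the final capacity before appending. The sketch below is written in current Swift syntax rather than the Swift 2 used in this thread, and `wordsToBytes` is a hypothetical helper name, not something from the original code. It is only meant to show the idea, not to replace the answer's code.

```swift
// Sketch in current Swift syntax (an assumption; the thread uses Swift 2).
// `wordsToBytes` is a hypothetical helper name.
func wordsToBytes(_ words: [UInt32]) -> [UInt8] {
    var bytes = [UInt8]()
    // Reserve the final size up front so the appends below never
    // outgrow the buffer and force a reallocation plus copy.
    bytes.reserveCapacity(words.count * 4)

    for word in words {
        // Emit the most significant byte first, as in the question.
        for shift in stride(from: 24, through: 0, by: -8) {
            bytes.append(UInt8(truncatingIfNeeded: word >> UInt32(shift)))
        }
    }
    return bytes
}

let sample = wordsToBytes([0x01020304, 0xAABBCCDD])
print(sample) // [1, 2, 3, 4, 170, 187, 204, 221]
```

Whether this helps in a given program should be confirmed with the profiler, as described above.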

func convertWordToBytes2(fullW: [UInt32]) -> [[UInt8]] {
    let subSize = 6

    // We allocate the array only once per run since allocation is so slow
    // There will only be assignment to it after
    var combined48 = [UInt8](count: fullW.count * 4, repeatedValue: 0).splitBy(subSize)

    var row = 0
    var col = 0

    for i in 0...16 {
        for j in 24.stride(through: 0, by: -8) {
            let value = UInt8(truncatingBitPattern: fullW[i] >> UInt32(j))
            combined48[row][col] = value

            col += 1
            if col >= subSize {
                row += 1
                col = 0
            }
        }
    }

    return combined48
}
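For readers on a newer toolchain, the same preallocate-then-assign approach might look roughly like this in current Swift. This is a loose translation and not the answer's own code: the names `chunked(into:)`, `convertWordsToRows`, and `rowSize` are hypothetical, and the loop runs over every input word instead of the fixed `0...16` range.

```swift
// Rough translation to current Swift syntax; the answer itself is Swift 2.
// `chunked(into:)` and `convertWordsToRows` are hypothetical names.
extension Array {
    /// Splits the array into consecutive chunks of at most `size` elements.
    func chunked(into size: Int) -> [[Element]] {
        precondition(size > 0, "chunk size must be positive")
        return stride(from: 0, to: count, by: size).map { start in
            Array(self[start ..< Swift.min(start + size, count)])
        }
    }
}

func convertWordsToRows(_ words: [UInt32], rowSize: Int = 6) -> [[UInt8]] {
    // Build the zero-filled row structure first, then only assign into it.
    var rows = [UInt8](repeating: 0, count: words.count * 4).chunked(into: rowSize)
    var row = 0
    var col = 0

    for word in words {
        // Most significant byte first, matching the original code.
        for shift in stride(from: 24, through: 0, by: -8) {
            rows[row][col] = UInt8(truncatingIfNeeded: word >> UInt32(shift))
            col += 1
            if col >= rowSize {
                row += 1
                col = 0
            }
        }
    }
    return rows
}

let rows = convertWordsToRows([UInt32](repeating: 0x01020304, count: 17))
print(rows.count, rows[0], rows[11]) // 12 [1, 2, 3, 4, 1, 2] [3, 4]
```

With 17 input words there are 68 bytes, so the result has 11 full rows of 6 and a final row of 2.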

Benchmark code:

let testCases = (0..<1_000_000).map { _ in
    (0..<17).map { _ in arc4random() }
}

testCases.forEach {
    convertWordToBytes($0)
    convertWordToBytes2($0)
}

Results (on my 2012 iMac):

Weight            Self Weight   Symbol Name
9.35 s   53.2%    412.00 ms     specialized convertWordToBytes([UInt32]) -> [[UInt8]]
3.28 s   18.6%    344.00 ms     specialized convertWordToBytes2([UInt32]) -> [[UInt8]]

By eliminating the multiple allocations, we have already cut the run time by 60%. But each test case is independent, which lends itself perfectly to parallel processing on today's multi-core CPUs. A modified loop...:

dispatch_apply(testCases.count, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)) { i in
    convertWordToBytes2(testCases[i])
}

...shaves about 1 second off the wall time when run on my quad-core i7 with 8 threads:

Weight           Self Weight   Symbol Name
2.28 s   6.4%    0 s           _dispatch_worker_thread3 0x58467
2.24 s   6.3%    0 s           _dispatch_worker_thread3 0x58463
2.22 s   6.2%    0 s           _dispatch_worker_thread3 0x58464
2.21 s   6.2%    0 s           _dispatch_worker_thread3 0x58466
2.21 s   6.2%    0 s           _dispatch_worker_thread3 0x58465
2.21 s   6.2%    0 s           _dispatch_worker_thread3 0x58461
2.18 s   6.1%    0 s           _dispatch_worker_thread3 0x58462

That is not as much of a time saving as I had hoped. Apparently there is some contention when accessing heap memory. For anything faster, you should explore a C-based solution.
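In current Swift, the counterpart of the `dispatch_apply` loop is `DispatchQueue.concurrentPerform`. The sketch below shows one way to run independent test cases in parallel and keep their results, assuming a newer toolchain than the Swift 2 used in this thread. `checksum(of:)` and `parallelChecksums` are hypothetical stand-ins for the real per-case work, chosen only to keep the example short.

```swift
import Dispatch

// `checksum(of:)` is a hypothetical stand-in for the per-case work
// (convertWordToBytes2 in the answer above).
func checksum(of words: [UInt32]) -> UInt32 {
    return words.reduce(0) { $0 &+ $1 }
}

// Hypothetical helper: processes every input in parallel.
func parallelChecksums(_ inputs: [[UInt32]]) -> [UInt32] {
    // One result slot per input, allocated once before the parallel loop.
    var results = [UInt32](repeating: 0, count: inputs.count)

    results.withUnsafeMutableBufferPointer { buffer in
        guard let base = buffer.baseAddress else { return }
        // Each iteration writes only to its own index, so no lock is needed.
        DispatchQueue.concurrentPerform(iterations: inputs.count) { i in
            base[i] = checksum(of: inputs[i])
        }
    }
    return results
}

let inputs: [[UInt32]] = (0..<1_000).map { i in
    (0..<17).map { j in UInt32(i * 17 + j) }
}
let sums = parallelChecksums(inputs)
print(sums.count)
```

Writing into a preallocated buffer keeps the parallel loop itself free of array growth. It does not remove the allocations made inside the per-case function, which is where the contention mentioned above would come from.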

Source

2016-10-30 03:50:00

Thank you very much! This made my code much faster. Thanks especially for the benchmarking and the explanation. – p0ppy
