52 Things, Number 1: The Differences Between Types of Processors

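Technical Notes | March 13, 2015
Repost notice: this post is original to this site. Reposting is welcome, but please keep the original address.
Original address: http://www.vonwei.com/post/1outof52.html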

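An earlier post on this blog, "Information Security and Cryptography PhDs: 52 Things You Should Know", introduced the 52 basic topics. This post covers the first one: the differences between general-purpose processors, special-purpose processors (co-processors), FPGAs and instruction-set extensions, so that readers can understand and compare these kinds of processor. The English original from Blogspot comes first, followed by my own translation and supplementary notes. Comments, discussion and corrections are welcome. Thank you!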


52 Things: Number 1 : Different Types of Processors

Posted by Jake Longo Galea (original post: http://bristolcrypto.blogspot.co.uk/2014/10/52-things-number-1-different-types-of.html)

This is the first in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography. The set of questions has been compiled to give PhD candidates a sense of what they should know by the end of their first year - and as an early warning system to advise PhD students they should get out while they still can! In any case, we will be presenting answers to each of the questions over the next year and I have been volunteered for the (dubious) honour of blogging the very first 'thing'. The first topic is on computer architecture and is presented as the following question:

 

What is the difference between the following?

- A general-purpose processor.

- A general-purpose processor with instruction-set extensions.

- A special-purpose processor (or co-processor).

- An FPGA.

There is no strict definition of a general-purpose processor; however, it is widely accepted that a processor is general if it is Turing-complete. This captures any processor that is able to compute anything actually computable (i.e. can solve any problem a Turing machine can). I will not delve into the definition of a Turing machine, but if I've already lost you then I would recommend brushing up on your Theory of Computation [1]. Note, though, that this definition has no concept of performance or instruction-set capabilities and, in fact, some researchers have gone to the trouble of proving that you may only need a single instruction to be Turing-complete [2]. In the context of modern processors, most programmable CPUs are considered general purpose.
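
To make the single-instruction idea concrete, here is a minimal sketch of a one-instruction machine. It is not the construction from [2] (which concerns the x86 mov instruction); it swaps in SUBLEQ ("subtract and branch if less than or equal to zero"), a different and well-known one-instruction design. The memory layout and the helper name add_with_subleq are assumptions made for illustration.

```python
def run_subleq(memory, max_steps=10_000):
    """Run a SUBLEQ program and return the final memory.

    Every instruction is three cells (a, b, c):
        memory[b] -= memory[a]; if memory[b] <= 0, jump to c, else fall through.
    A negative program counter halts the machine.
    """
    mem = list(memory)
    pc = 0
    for _ in range(max_steps):
        if pc < 0 or pc + 2 >= len(mem):
            break
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
    return mem


def add_with_subleq(x, y):
    """Compute x + y using only SUBLEQ (illustrative layout: X=12, Y=13, Z=14)."""
    X, Y, Z = 12, 13, 14
    program = [
        X, Z, 3,    # Z = Z - X  -> Z holds -x
        Z, Y, 6,    # Y = Y - Z  -> Y holds y + x
        Z, Z, 9,    # Z = 0 (restore the scratch cell)
        Z, Z, -1,   # Z stays 0, so jump to -1: halt
        x, y, 0,    # data cells X, Y, Z
    ]
    return run_subleq(program)[Y]


print(add_with_subleq(7, 5))
```

Addition is built from nothing but subtraction and a conditional jump, which illustrates why Turing-completeness says nothing about convenience or speed.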

       Being general-purpose usually comes at a penalty in performance. Specifically, a general-purpose processor may be able to compute anything computable, but it will never excel at complex repeated tasks. Given a task that is repeated regularly on a general-purpose processor in a wide variety of applications, a processor designer may incorporate instruction-set extensions to the base micro-architecture to accommodate the task. Functionally, there may be no difference in the micro-architecture capabilities, but practically there may be huge performance gains for the end-user.

       As we're all cryptographers here, I will stick to a crypto example for instruction-set extensions. Consider a desktop machine with an AES encrypted disk. Any reads from secondary storage require a CPU interrupt to decrypt the data blocks before being cached. Given that disk access from a cache miss is already considered terrible, add the decryption routine over the top and you have a bottleneck that may make you reconsider your disk encryption. It should be clear here that AES is our complex repeated task and, given a general-purpose CPU with a simple instruction-set, we have no choice but to implement the decryption as a linear stream of operations. Intel and AMD both recognised the demand for disk encryption and the penalty AES adds to secondary storage access and have (since circa 2010) produced the AES-NI x86 instruction-set extension to accelerate disk encryption on their lines of desktop CPUs.
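
Because an instruction-set extension is optional hardware, software normally checks for it before relying on it. The sketch below assumes a Linux x86 host, where /proc/cpuinfo lists CPU feature flags and the flag "aes" indicates AES-NI. The function names are hypothetical, and other operating systems expose this information differently.

```python
def has_cpu_flag(cpuinfo_text, flag="aes"):
    """Return True if a 'flags' line in cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


def host_supports_aes_ni(path="/proc/cpuinfo"):
    """Check the running host; returns False when the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as handle:
            return has_cpu_flag(handle.read())
    except OSError:
        return False


if __name__ == "__main__":
    print("AES-NI available:", host_supports_aes_ni())
```

A crypto library would typically make a check like this once at start-up and fall back to a plain software AES implementation when the flag is missing.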

       If you want to fully accelerate any computation, the best result will always come from a special-purpose processor or an application-specific integrated circuit (ASIC). Here we lose a significant portion of the flexibility granted by a general-purpose processor in exchange for performance gains. These types of processors are often tightly coupled to a general-purpose processor, hence the term co-processor. Note that a co-processor may indeed be in the same package as a general-purpose processor but not necessarily integrated into the general-purpose architecture. Once again, if we turn to modern processor architectures, Intel and AMD have both integrated sound cards, graphics processors and DSP engines into their CPUs for some time now. The additional functionality is exposed via special-purpose registers, and the co-processor is treated as a separate component which the general-purpose processor must manage.
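
The "exposed via registers, managed by the host" pattern can be sketched with a toy model. Everything here is hypothetical: the register names (OP_A, OP_B, CTRL, STATUS, RESULT), the bit layout and the multiply operation are invented for illustration and do not describe any real device.

```python
class ToyCoprocessor:
    """Hypothetical co-processor that multiplies two operand registers."""

    START = 0x1  # CTRL bit: host asks the co-processor to begin
    DONE = 0x1   # STATUS bit: RESULT is valid

    def __init__(self):
        self.regs = {"OP_A": 0, "OP_B": 0, "CTRL": 0, "STATUS": 0, "RESULT": 0}

    def write(self, name, value):
        self.regs[name] = value
        if name == "CTRL" and value & self.START:
            self.regs["STATUS"] &= ~self.DONE  # busy until the work completes

    def read(self, name):
        return self.regs[name]

    def tick(self):
        """One step of the co-processor's own work, independent of the host."""
        if self.regs["CTRL"] & self.START:
            self.regs["RESULT"] = self.regs["OP_A"] * self.regs["OP_B"]
            self.regs["CTRL"] &= ~self.START
            self.regs["STATUS"] |= self.DONE


def offload_multiply(cop, a, b, max_polls=100):
    """Host side: load operands, start the job, poll STATUS, read RESULT."""
    cop.write("OP_A", a)
    cop.write("OP_B", b)
    cop.write("CTRL", ToyCoprocessor.START)
    for _ in range(max_polls):
        if cop.read("STATUS") & ToyCoprocessor.DONE:
            return cop.read("RESULT")
        cop.tick()  # stands in for the hardware running in the background
    raise TimeoutError("co-processor did not finish")


print(offload_multiply(ToyCoprocessor(), 6, 7))
```

The host never performs the multiplication itself. It only moves data in and out of registers and waits, which is the management role the paragraph above describes.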

       Finally we turn to Field-Programmable Gate Arrays (FPGAs), the middle ground between ASICs and general-purpose processors. If an application demands high-performance throughput but also requires (infrequent) modification then an FPGA is probably the answer. To understand how an FPGA works, consider a (very) large electronics breadboard with thousands of logic gates and lookup tables (multiplexers attached to memory) placed all around the breadboard. If you describe an application as a set of gates and timing constraints then you can wire it together on the breadboard and produce a circuit that will evaluate your application. An FPGA affords the flexibility of being re-programmable whilst producing the dedicated logic to evaluate a target application. The key difference from a general-purpose program is how you design and build your application. In order to get the most out of the hardware you must describe the application as a set of hardware components and events using a hardware description language (Verilog or VHDL). This process is frequently used to prototype general-purpose and special-purpose processors on FPGAs before production. However, it is not without its drawbacks. Designing a program with low-level building blocks becomes very cumbersome for large applications. In addition, the energy consumption and hardware costs are generally higher in comparison to a general-purpose embedded IC. Recently, FPGA manufacturer Xilinx have begun shipping FPGAs with ARM general-purpose cores integrated within a single package [3]. This now makes FPGAs available to the ARM core as a flexible co-processor. As a result, you can build dedicated logic to evaluate your crypto primitives and hence accelerate cryptographic applications.
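
The "multiplexer attached to memory" description of a lookup table can be modelled in a few lines. This is a software sketch only, not a hardware description: the class name LUT and the choice of a 4-bit ripple-carry adder are assumptions for illustration. The point is that the stored bits, not the wiring of the table itself, decide which logic function it computes.

```python
class LUT:
    """A k-input lookup table: the inputs select one stored configuration bit."""

    def __init__(self, num_inputs, table):
        if len(table) != 2 ** num_inputs:
            raise ValueError("table needs 2**num_inputs entries")
        self.num_inputs = num_inputs
        self.table = list(table)

    @classmethod
    def from_function(cls, num_inputs, func):
        """'Program' the LUT by tabulating func over every input combination."""
        table = []
        for index in range(2 ** num_inputs):
            bits = [(index >> i) & 1 for i in range(num_inputs)]
            table.append(func(*bits) & 1)
        return cls(num_inputs, table)

    def __call__(self, *bits):
        # The input bits form the address, like select lines on a multiplexer.
        index = sum(bit << i for i, bit in enumerate(bits))
        return self.table[index]


# Two 3-input LUTs configured as the sum and carry outputs of a full adder.
SUM = LUT.from_function(3, lambda a, b, cin: a ^ b ^ cin)
CARRY = LUT.from_function(3, lambda a, b, cin: (a & b) | (cin & (a ^ b)))


def ripple_add(x, y, width=4):
    """Chain the LUT pair bit by bit; returns (sum modulo 2**width, carry out)."""
    carry, total = 0, 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        total |= SUM(a, b, carry) << i
        carry = CARRY(a, b, carry)
    return total, carry


print(ripple_add(5, 6))
```

Re-programming the device corresponds to loading different tables: the same LUT objects could become an XOR network or a comparator without any change to the class.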

       In summary, general-purpose processors are capable of computing anything computable. The same holds for a general-purpose processor with instruction-set extensions, which may additionally perform better in particular applications. A special-purpose processor (or co-processor) is very fast at a particular task but is unable to compute anything outside of that. An FPGA can be used to build all of the above hardware but sacrifices speed for flexibility compared with an ASIC solution.

 

Translation and supplementary notes:

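       There is no strict definition of a general-purpose processor, but the most widely accepted notion is that a general-purpose processor is Turing-complete: its computing power matches that of a universal Turing machine, so it can compute any Turing-computable function (anything that is computable at all). For the definition of a Turing machine, see the introduction to the theory of computation in [1]; it is a large body of knowledge in its own right and is not covered here. Note that this definition says nothing about performance or the capabilities of the instruction set; in fact, the researchers in [2] have shown that a single instruction can be enough for Turing-completeness. All general-purpose programming languages and the instruction sets of modern computers are Turing-complete, and most programmable CPUs in today's processors can be regarded as general-purpose processors.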

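       Being general-purpose usually costs performance. Specifically, a general-purpose processor can compute any computable task but does not excel at complex, repeated tasks. When a task is repeated regularly on a general-purpose processor, the processor designer may add instruction-set extensions to the base micro-architecture to handle that task. Functionally, the capabilities of the micro-architecture do not change, but in practice the end user may see a large performance improvement. CPUs differ little in basic functionality, and their base instruction sets are broadly similar. To improve performance in a particular area, however, many vendors have developed extended instruction sets. These define new data types and instructions that can greatly improve a particular kind of data processing, though they may need matching software support.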

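       Here is a cryptographic example of an instruction-set extension. Suppose a desktop machine has an AES-encrypted disk. Every read from this secondary storage (the disk) then needs a CPU interrupt to decrypt the data blocks before they can be loaded into memory for processing. Going to disk after a cache miss is already considered terrible for performance; adding a decryption routine on top creates a bottleneck that may make you reconsider disk encryption. Clearly AES is the complex, repeated task here, and on a general-purpose CPU with a simple instruction set there is no choice but to implement decryption as a linear stream of operations. Intel and AMD both recognised the demand for disk encryption and the penalty that AES adds to secondary-storage access, and they produced the AES-NI x86 instruction-set extension to accelerate disk encryption.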

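       If you want to fully accelerate a computation, the best choice is a special-purpose processor or an application-specific integrated circuit (ASIC). This improves performance at the cost of the flexibility of a general-purpose processor. Such processors are usually tightly coupled to a general-purpose processor, hence the name co-processor. Taking modern processor architectures as an example, Intel and AMD have both integrated sound cards, graphics processors and DSP engines into their CPUs. The added functionality is exposed through special-purpose registers, and the co-processor is treated as a separate component that the general-purpose processor must manage. An auxiliary processor, also translated as co-processor, is a processor developed to assist the central processor (the general-purpose processor) with work that it cannot do, or that it does inefficiently or poorly. Work the central processor cannot do includes signal transmission between devices and management of attached devices; work it does inefficiently includes graphics and audio processing. Auxiliary processors of various kinds emerged to handle these tasks. Note that because the integer unit and the floating-point unit are integrated in today's computers, the floating-point processor no longer counts as an auxiliary processor. Likewise, a co-processor built into the CPU does not count as an auxiliary processor unless it exists as an independent unit.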

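       Finally, consider the field-programmable gate array (FPGA), which sits between general-purpose processors and special-purpose processors. If an application needs high throughput but also needs occasional modification, an FPGA is probably the best choice. To understand how an FPGA works, picture a very large electronics breadboard with thousands of logic gates and lookup tables (multiplexers attached to memory) spread across it. If you describe your application as a set of gates and timing constraints, you can wire them together on the breadboard to form a circuit that evaluates your application. An FPGA offers the flexibility of being reprogrammable while still producing dedicated logic for the target application. The key difference from a general-purpose program is how you design and build the application: you must use a hardware description language (such as Verilog or VHDL) to describe it as a set of hardware components and events. This process is often used to prototype general-purpose and special-purpose processors on FPGAs before the chips go into production. It is not without drawbacks, however. Designing a program from low-level building blocks becomes very cumbersome for large applications. In addition, energy consumption and hardware cost are generally higher than for a general-purpose embedded IC. Recently, the FPGA vendor Xilinx has begun shipping FPGAs with ARM general-purpose cores integrated in a single package [3]. This makes the FPGA available to the ARM core as a flexible co-processor, so you can build dedicated logic to evaluate your cryptographic primitives and accelerate cryptographic operations.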

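       In summary, a general-purpose processor can compute anything computable. A general-purpose processor with instruction-set extensions is the same in that respect, but performs better in particular applications. A special-purpose processor (or co-processor) is very fast at a particular task but cannot compute anything outside it. An FPGA can be used to build all of the above hardware, but compared with an ASIC it trades speed for flexibility. The fundamental difference between a CPU and an FPGA is the difference between software and hardware. A CPU follows the von Neumann architecture and executes a sequence of instructions serially, whereas an FPGA can operate in parallel, much like embedding many CPUs in one chip; for workloads that parallelise well, its performance can be ten or even a hundred times that of a single CPU. In general, whatever a CPU can do can also be implemented on an FPGA through hardware design. Extremely complex algorithms are, of course, hard to implement in hardware and consume many resources, so without a high performance requirement a hardware implementation is not worth the cost. For a complex system, a sensible hardware/software partition, with a CPU (or DSP) and hardware circuits (such as an FPGA) cooperating to deliver the system's functions, is both necessary and efficient.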
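
The serial-versus-parallel contrast can be made concrete by counting steps. The sketch below is a simplified model under stated assumptions: one addition per step, a strictly serial processor, and enough hardware adders to add all pairs at a level simultaneously. It ignores clock rate, routing and resource limits, so it illustrates where a speed-up can come from and is not a performance estimate.

```python
def serial_add_steps(n):
    """Summing n values one after another takes n - 1 addition steps."""
    return max(n - 1, 0)


def tree_add_steps(n):
    """Pairwise adder tree: each level halves the number of partial sums."""
    steps = 0
    while n > 1:
        n = (n + 1) // 2  # all pairs at this level are added at the same time
        steps += 1
    return steps


for count in (8, 100, 1024):
    print(count, serial_add_steps(count), tree_add_steps(count))
```

For 1024 values the serial model needs 1023 steps while the tree needs 10, at the price of many adders working at once. That price is the resource consumption mentioned above.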

 

[1] http://www.amazon.co.uk/Introduction-Theory-Computation-Michael-Sipser/dp/0619217642

[2] http://www.cl.cam.ac.uk/~sd601/papers/mov.pdf

[3] http://www.xilinx.com/products/zynq-7000/extensible-virtual-platform.htm

 


