Sciencemadness Discussion Board
Author: Subject: FPGA vs. ASIC performance
woelen
Super Administrator
*********




Posts: 8013
Registered: 20-8-2005
Location: Netherlands
Member Is Offline

Mood: interested

[*] posted on 16-4-2013 at 05:15
FPGA vs. ASIC performance


I am investigating the possibility of creating a cheap and versatile device which can do many cryptographic hash computations per second (preferably several billion such computations per second, each hash being a 256-bit number).

CPU-based and GPU-based solutions are not feasible. Even the fastest and most expensive GPU-based solutions can compute only a few hundred million cryptographic hashes per second.

I am looking into FPGA-based solutions, but I wonder, how well do modern FPGA's compare to ASIC's? I consider an ASIC out of my reach, due to the very high non-recurring engineering costs. FPGA's can be programmed, can be reprogrammed if things are wrong, and can be used for other purposes as well if the project does not succeed. My experience in this field is old and limited, and I wonder what the current state of the art is. Is it possible to create an FPGA-based solution for $500 or less (for materials; time is not an issue, it's an amateur project) which can compete with an ASIC? ASIC's are interesting if the solution is produced in quantities of 1000 or more, but my project only needs one instance (for my own use), and then the use of ASIC's is not an option at all in my opinion.

I have seen FPGA-boards, which can be connected to a USB2 port and there also are boards, which can connect to a PCI-bus. Many of these boards have clock frequencies on the order of a few hundred MHz, and the more luxurious boards have embedded CPU cores as well (e.g. a Cortex A9 running at 700 MHz or so). The CPU is not interesting for what I want, but it may help in orchestrating the processing.

If you know of a particular cost effective and very fast FPGA, then that would be helpful as well.

[Edited on 16-4-13 by woelen]




The art of wondering makes life worth living...
Want to wonder? Look at https://woelen.homescience.net
watson.fawkes
International Hazard
*****




Posts: 2793
Registered: 16-8-2008
Member Is Offline

Mood: No Mood

[*] posted on 16-4-2013 at 08:08


Quote: Originally posted by woelen  
I am looking into FPGA-based solutions, but I wonder, how well do modern FPGA's compare to ASIC's?
[...]
I have seen FPGA-boards, which can be connected to a USB2 port and there also are boards, which can connect to a PCI-bus. [...] If you know of a particular cost effective and very fast FPGA, then that would be helpful as well.
Today's FPGA has the performance of an ASIC from a year or two ago. There's some performance penalty, but it's mostly in gate density.

There are always evaluation boards available from the manufacturers such as Xilinx and Altera. Pick your budget and buy an evaluation board. I don't know your application, but the difference between USB and PCI is bandwidth to and from the device. If there's a high ratio of crypto to I/O, USB is unlikely to be your bottleneck. As for CPU, you absolutely want one on-chip, because it's the easiest way to do I/O to the custom circuitry; if not, you'll have to program a CPU on the board (off-chip) to do the same thing.

"Fastest" and "best" change constantly; there's a constant stream of new devices. Don't bother with extensive evaluation. Select based on budget and availability and make your purchase the moment before you're going to start using it. If the software toolchain matters for whatever reason, that may narrow down your vendor list.
mr.crow
National Hazard
****




Posts: 884
Registered: 9-9-2009
Location: Canada
Member Is Offline

Mood: 0xFF

[*] posted on 16-4-2013 at 09:41


How much hardware experience do you have? I can help you get started with FPGAs and Verilog if you want

EDIT: I would recommend a Xilinx Spartan based FPGA, especially for development. The others are much higher end and more expensive. Altera Cyclone would also work. Always get the latest version.

A soft processor would work in place of a real CPU core. The tools are supposed to be free but I would get a pirate copy anyway so you don't have to worry about hassles.

Once you get your basic design working you can think about scale up and massive parallelization



[Edited on 16-4-2013 by mr.crow]




Double, double toil and trouble; Fire burn, and caldron bubble
12AX7
Post Harlot
*****




Posts: 4803
Registered: 8-3-2005
Location: oscillating
Member Is Offline

Mood: informative

[*] posted on 16-4-2013 at 10:41


It's my recollection the difference between FPGA and ASIC, same code and fab, is about a doubling of speed, or better. For instance, I've heard of designers working on MPEG decoders: it had to be breadboarded over multiple FPGAs, and the delay of all those pins on the circuit board made it go tremendously slow. Like, an 800MHz core running at 60MHz, or taking hours to decode a single frame. So once you cross the boundary of a single device, the dis-economy of scale is great indeed.

But seriously, what do you need billions of hashes for, that can't simply be solved with another ten cheap computers?

Tim




Seven Transistor Labs LLC http://seventransistorlabs.com/
Electronic Design, from Concept to Layout.
Need engineering assistance? Drop me a message!
woelen
Super Administrator
*********




Posts: 8013
Registered: 20-8-2005
Location: Netherlands
Member Is Offline

Mood: interested

[*] posted on 16-4-2013 at 11:28


USB indeed is not a bottleneck for my application. Actually I need connectivity to the internet, but I use a Raspberry Pi for that. It reads and writes the USB port and does the communication over the internet and lets the FPGA or ASIC do the computationally intensive stuff.
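
A back-of-the-envelope check of why USB is not the bottleneck, as a sketch only. It assumes a Bitcoin-style work unit (an 80-byte block of input data for which the device sweeps a 32-bit counter on its own) and a target of 5 billion hashes per second; both numbers, and the names `HEADER_BYTES`, `NONCES_PER_UNIT` and `host_traffic_bytes_per_s`, are illustrative assumptions and not properties of any particular board.

```python
HEADER_BYTES = 80          # assumed size of one work unit sent to the device
NONCES_PER_UNIT = 2 ** 32  # assumed: the device sweeps a 32-bit counter per work unit
USB2_BITS_PER_S = 480e6    # USB 2.0 high-speed raw signalling rate, before protocol overhead


def host_traffic_bytes_per_s(hash_rate: float) -> float:
    """Bytes per second the host must send to keep the device busy."""
    work_units_per_s = hash_rate / NONCES_PER_UNIT
    return work_units_per_s * HEADER_BYTES


rate = 5e9  # assumed target: 5 billion hashes per second
traffic = host_traffic_bytes_per_s(rate)
fraction = traffic * 8 / USB2_BITS_PER_S
print(f"{traffic:.0f} bytes/s to the device, {fraction:.2e} of the raw USB 2.0 rate")
```

Under these assumptions the host sends on the order of a hundred bytes per second, so even a slow serial link would do.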

The application is tremendously heavy. I expect that I need to do appr. 10^17 hashes every few months and this boils down to several billions of hashes per second. A solution with 10 cheap PC's is no option. First, 10 PC's are not cheaper than a FPGA (they are much cheaper though than the initial cost of creating an ASIC), they consume a lot of electrical power and last but not least, even 10 beasts of machines will be MUCH slower than what I need. I really need a dedicated very special setup.
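
The conversion from 10^17 hashes to a sustained rate can be checked with a few lines of arithmetic. This is only a sketch: "every few months" is read here as 2, 3 or 6 months of 30 days each, which is an assumption, and `required_rate` is a made-up helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(total_hashes: float, months: float, days_per_month: float = 30.0) -> float:
    """Average hashes per second needed to finish total_hashes in the given time."""
    return total_hashes / (months * days_per_month * SECONDS_PER_DAY)


for months in (2, 3, 6):
    rate = required_rate(1e17, months)
    print(f"{months} months: {rate / 1e9:.1f} billion hashes/s")
```

With these readings the rate comes out between roughly 6 and 20 billion hashes per second.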

I'll have a look at the Xilinx and Altera boards and I'll try to find one with a built-in CPU. As I wrote before, orchestration of the application probably can best be done by that CPU and it also can be used to read from/write to the host through the USB port.

If I use a Raspberry Pi as host, then the tool chain must run on Linux. The Raspberry Pi unfortunately has no Intel processor. Another option for me is to use a simple Shuttle PC as host. These are fairly cheap and consume little power. Speed of the host system is not important at all, it just has to transfer some data from the internet to the USB bus and vice versa.

I have some hardware experience (e.g. with logic gates and creating functional blocks on the basis of sets of gates and feedback of signals). I once made a successive-approximation AD-converter, which sampled data and used an EPROM plus DA-converter and comparator, with a lookup table in the EPROM performing a kind of interval bisection until the voltage was pinned down to one ULP. This is long ago. Nowadays I expect an FPGA to have a million or so gates, several megabytes of fast RAM and a lot of I/O options running at hundreds of MHz. Massive parallelism in the algorithms should make it possible to perform hashing at very high rates, much faster than even 10 PCs. I do not have experience with VHDL, Verilog or any other hardware description language. My experience was limited to drawing schematics with lots of gates. I assume that this is not suitable for describing a hashing algorithm, so I have to teach myself something in this direction as well.

[Edited on 16-4-13 by woelen]




12AX7
Post Harlot
*****




Posts: 4803
Registered: 8-3-2005
Location: oscillating
Member Is Offline

Mood: informative

[*] posted on 16-4-2013 at 21:47


I don't know of any FPGAs with "millions of gates", most are on the order of 10-200k. But what they call a "gate" is a general purpose cell with combinatorial and memory (flip-flop) functions, usually a 4 or 6-input look up table (LUT). So what you might otherwise need a handful of NAND chips to build can be implemented in one or a few LUTs. Note the general-purpose design makes complicated logic functions easier, so for example, a 4-input XOR gate, which would be, to say the least, tedious to implement otherwise (none of its eight minterms can be merged), is just another block. This will probably be an advantage with the complexity of hashing.
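
To make the LUT idea concrete, here is a small software model, offered as a sketch rather than a description of any vendor's cell. A 4-input LUT is just a 16-entry truth table addressed by the input bits, so a 4-input XOR costs the same as any other 4-input function. The names `make_lut` and `lut_eval` are invented for this illustration.

```python
from itertools import product


def make_lut(func, n_inputs: int) -> list:
    """Precompute the truth table; the first input is the most significant index bit."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]


def lut_eval(lut, *bits) -> int:
    """Look up the output for the given input bits (no logic is evaluated here)."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return lut[index]


# A 4-input XOR is one 16-entry table, exactly like any other 4-input function.
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
print(xor4)
print(lut_eval(xor4, 1, 0, 1, 1))
```

Half of the 16 entries are 1, and no two of them are adjacent in a Karnaugh map, which is why a sum-of-products version built from discrete gates needs all eight 4-literal terms.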

You also have the option of serial versus parallel action, and pipelining. Which gets that much more difficult to implement when you're just learning VHDL (or Verilog).

You can probably implement a logic-and-register chain which does the software equivalent in, say, one clock cycle per bit (analogous to, if you're familiar with it, the shift-and-accumulate multiply and divide algorithms), with some savings on gates (the very first computers were intensely bit-serial and used only a few hundred relays and tubes to implement computations, of course requiring very many cycles to complete only a single instruction). This may have more relevance when the hash is viewed as a bit stream into a state machine (which is how CRC codes work), in which case, clocking the state machine at the bitstream rate makes perfect sense.
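
As an illustration of the bit-serial state machine idea, the sketch below models a CRC-32 register that consumes one input bit per "clock". It uses the reflected CRC-32 polynomial that zlib uses, chosen here only because the result can be checked against `zlib.crc32`; the function name `crc32_bit_serial` is made up for this example.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial, as used by zlib


def crc32_bit_serial(data: bytes) -> int:
    """Model of a 32-bit shift register clocked once per input bit."""
    state = 0xFFFFFFFF
    for byte in data:
        for i in range(8):                   # bits arrive least significant first
            bit = (byte >> i) & 1
            feedback = (state ^ bit) & 1     # XOR of the incoming bit and the register's low bit
            state >>= 1                      # one shift per clock
            if feedback:
                state ^= POLY                # feedback taps
    return state ^ 0xFFFFFFFF


message = b"123456789"
print(hex(crc32_bit_serial(message)), hex(zlib.crc32(message)))
```

In hardware the inner loop body is one clock cycle: a 32-bit register, a handful of XOR gates and nothing else.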

Or you can try paralleling as much as you can -- in principle, a hash is nothing but an N-to-M combinatorial logic function, with N variable (depending on hash, perhaps N could be fixed, padding it if shorter, or using a suitable function to fold over longer data) and M fixed. This could, in principle, be implemented without a clock, but the propagation delay through the complex logic would be prohibitive (if you can write the function to begin with).
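
SHA-256 is one example of how a real hash fixes N: the message is padded to a multiple of 512 bits and then folded block by block through a fixed-size compression function. The sketch below shows only the padding step as specified for SHA-256 (a 0x80 byte, zero bytes, then the 64-bit big-endian message length); `sha256_pad` is a name invented here, and the compression function itself is not shown.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message to a whole number of 64-byte (512-bit) blocks, SHA-256 style."""
    bit_len = len(message) * 8
    padded = message + b"\x80"                     # a single 1 bit, then zeros
    padded += b"\x00" * ((56 - len(padded)) % 64)  # leave 8 bytes for the length field
    return padded + bit_len.to_bytes(8, "big")


for n in (0, 55, 56, 80):
    print(n, "bytes ->", len(sha256_pad(bytes(n))) // 64, "block(s)")
```

For fixed-length inputs the number of blocks is known in advance, so the hardware can be laid out as a fixed chain of compression stages.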

This is where pipelining comes in: if you can complete a reasonable hunk of combinatorial logic within a certain delay, then pass it to the next cell, you can get reasonable compromises between speed, logic size and overall complexity (at some expense to other complexities -- you have to manage the data coming in and out, delayed by the pipeline). You probably won't have to do all the pipeline state stuff they have to do on processors, though, because I'm guessing you'll just be blasting data in and out.
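
A toy software model of the pipelining trade-off, as a sketch only: each stage is a chunk of logic followed by a register, a new input enters every clock, and after the pipeline has filled one result comes out per clock. The three stage functions are arbitrary placeholders and `run_pipeline` is an invented name.

```python
def run_pipeline(stages, inputs):
    """Clock a register pipeline; returns (outputs, number_of_clock_cycles)."""
    regs = [None] * len(stages)   # regs[i] holds the registered output of stage i
    feed = list(inputs)
    outputs = []
    cycles = 0
    while len(outputs) < len(inputs):
        new_in = feed.pop(0) if feed else None     # None models an empty pipeline slot
        prev = [new_in] + regs[:-1]                # every stage reads its predecessor's register
        regs = [None if x is None else stage(x) for stage, x in zip(stages, prev)]
        cycles += 1
        if regs[-1] is not None:
            outputs.append(regs[-1])
    return outputs, cycles


stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # placeholder logic per stage
outputs, cycles = run_pipeline(stages, range(5))
print(outputs, cycles)
```

Five inputs through three stages take 3 + 5 - 1 = 7 clocks instead of 15, and the longer the input stream, the closer the throughput gets to one result per clock.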

Tim




Wizzard
Hazard to Others
***




Posts: 337
Registered: 22-3-2010
Member Is Offline

Mood: No Mood

[*] posted on 17-4-2013 at 04:47


Don't ASICs need to be custom produced? FPGA might be a very cost-effective solution, especially if it's a small project.
woelen
Super Administrator
*********




Posts: 8013
Registered: 20-8-2005
Location: Netherlands
Member Is Offline

Mood: interested

[*] posted on 17-4-2013 at 05:16


Indeed, ASICs are custom produced, that is the reason why the non-recurring engineering costs are so high. ASICs can be interesting but only if they are mass-produced and the initial cost can be divided over thousands or even tens of thousands of units.



phlogiston
International Hazard
*****




Posts: 1379
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline

Mood: pyrophoric

[*] posted on 17-4-2013 at 07:21


It sounds like you are looking into bitcoin mining?

If not, the design goal you describe is extremely similar to what is done for 'mining' bitcoins. Undoubtedly you have heard of this digital currency in the past weeks/months as the value of bitcoins has skyrocketed (and crashed a bit recently).

By performing hashing operations, you can generate 'bitcoins', which you can sell for good money these days (at this moment $87 or EUR 68 for one BTC). However, this costs electricity, so ideally you want to do as many hashes/s as possible using as little power as possible.
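
For readers who have not seen what these hashing operations look like, here is a toy sketch. As far as I know, Bitcoin's proof of work applies SHA-256 twice to an 80-byte block header whose last four bytes are a counter (the nonce), and a hash counts as a hit when, read as a little-endian integer, it does not exceed a target. The header below is 76 zero bytes and the target is deliberately easy; both are placeholders, not real network data, and `double_sha256` and `search_nonce` are illustrative names.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def search_nonce(header76: bytes, target: int, max_nonce: int = 2 ** 32):
    """Try nonces in order; return (nonce, hash_value) for the first hit, else None."""
    for nonce in range(max_nonce):
        header = header76 + struct.pack("<I", nonce)        # 76 + 4 = 80 bytes
        value = int.from_bytes(double_sha256(header), "little")
        if value <= target:
            return nonce, value
    return None


toy_header = bytes(76)     # placeholder, not a real block header
easy_target = 1 << 244     # toy difficulty: about 1 hash in 4096 qualifies
result = search_nonce(toy_header, easy_target, max_nonce=1_000_000)
print(result)
```

The loop body is the part a GPU, FPGA or ASIC runs in parallel; everything else stays on the host.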

Given the potential to generate interesting profits, a lot of effort has been put into this, progressing from using CPU's two years ago, to GPU's, to FPGA's and recently ASIC's.

The ASIC machines have only recently become available and there is a waiting list for buying them. The company that sells ASICs is called Avalon (http://launch.avalon-asics.com/). There are other companies that claim they have/sell them but beware that it is not clear when or even if they will actually ship them to customers.

For FPGAs, an open-source design has been published. I don't know where to find it but I'm sure googling will help you out. There is also a forum that may be helpful, www.bitcointalk.org

Edit:
Quote:
The application is tremendously heavy. I expect that I need to do appr. 10^17 hashes every few months and this boils down to several billions of hashes per second. A solution with 10 cheap PC's is no option. First, 10 PC's are not cheaper than a FPGA (they are much


The commercial ASIC-based machine manages 65 Ghashes/sec, so that is approximately the range you are aiming for.

[Edited on 17-4-2013 by phlogiston]




-----
"If a rocket goes up, who cares where it comes down, that's not my concern said Wernher von Braun" - Tom Lehrer
woelen
Super Administrator
*********




Posts: 8013
Registered: 20-8-2005
Location: Netherlands
Member Is Offline

Mood: interested

[*] posted on 17-4-2013 at 22:54


I'm indeed looking at the mining of bitcoins, but I do not simply want to follow the route of buying some box and crunching hashes. The solution I have in mind should be flexible and I also want something which can be used for other purposes (e.g. signal processing, performing mathematical computations).

I have read about the recent introduction of ASIC-based solutions, but I have not read any success stories about them. People are warning that the companies behind ASIC-based solutions may be scams. They offer equipment which can do tens of billions of hashes per second for just a few hundred dollars, maybe $1000 or so. The machine you mention is even better, with 65 billion hashes per second for $1250. I can hardly believe such things, because the initial cost of an ASIC runs into the millions of dollars and a company needs to sell at least several thousand units just to get back the initial cost. The best ready-to-use FPGA-based solutions have speeds of a few hundred million hashes per second and the cost of these FPGA boards is a few hundred dollars. Such a board may be interesting for me, just to play with and try my own algorithms, but I of course also need a tool chain for the FPGA and maybe I need additional hardware if I want to monitor and debug things.
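
To put rough numbers on this comparison, a sketch with loudly labelled assumptions. The 65 billion hashes per second and $1250 are the figures quoted above. The 300 million hashes per second, the $300 board price, the $2 million NRE and the $500 margin per unit are placeholders standing in for "a few hundred" and "millions", not measured or published data.

```python
import math


def hashes_per_dollar(hash_rate: float, price: float) -> float:
    """Hash rate bought per dollar of hardware."""
    return hash_rate / price


def break_even_units(nre_cost: float, margin_per_unit: float) -> int:
    """Units that must be sold before the one-off engineering cost is recovered."""
    return math.ceil(nre_cost / margin_per_unit)


asic = hashes_per_dollar(65e9, 1250)   # figures quoted in the thread
fpga = hashes_per_dollar(300e6, 300)   # ASSUMED stand-ins for "a few hundred"
print(f"ASIC: {asic:.2e} H/s per $, FPGA: {fpga:.2e} H/s per $, ratio {asic / fpga:.0f}x")

# ASSUMED: $2 million NRE and $500 margin per unit sold
print("break-even units:", break_even_units(2_000_000, 500))
```

With these placeholder values the break-even point lands at a few thousand units, in line with the estimate above, while the ASIC box delivers about fifty times more hashes per dollar than the FPGA board.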

I believe that with FPGA's and smarter algorithms I can get better results. The cost of the better FPGA boards, however, still holds me back. I looked into the links provided earlier and a decent board costs a few thousands of dollars.




phlogiston
International Hazard
*****




Posts: 1379
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline

Mood: pyrophoric

[*] posted on 18-4-2013 at 01:56


The ASIC miner company I mentioned, Avalon, is the only one that has actually shipped working units to customers, so they seem trustworthy. The machines work as advertised. There is a significant waiting list, and the customers who took the risk of ordering one early are now making good profits (especially during the recent peak in the bitcoin price), but this will decline as the difficulty goes up.

However, these machines do one thing only. They are not at all flexible. FPGAs, on the other hand, are extremely versatile, so they seem like a more useful investment for your purposes. An additional advantage is that you can resell them should you decide you do not need them anymore some day. You can also do lots of parallel calculations on GPU's (perhaps good enough for signal processing etc.), but these days they are not fast enough for mining BTC's anymore (or rather: not economical enough in terms of hashes/J).

You may also want to look into DSP's for signal processing/other purposes (but they are not good for mining bitcoins).

Do you have experience working with (digital) electronics / microprocessors etc? Otherwise, you are taking up quite a complex project to get started with, even for a clever guy such as yourself.




woelen
Super Administrator
*********




Posts: 8013
Registered: 20-8-2005
Location: Netherlands
Member Is Offline

Mood: interested

[*] posted on 18-4-2013 at 02:35


I have experience with electronics, and I work as a software architect and engineer. I have written a lot of high-speed crypto software in assembly, C, and Java (e.g. AES, SHA-224, SHA-256, FH-MQV shared secret generation based on groups on elliptic curves).




phlogiston
International Hazard
*****




Posts: 1379
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline

Mood: pyrophoric

[*] posted on 18-4-2013 at 02:58


OK, interesting background, then I can see why it is indeed realistic for you to think that you can improve upon the current designs/software! As you may have found out already, there is a lot of source code available for bitcoin mining that you might want to check out, both for inspiration and so that you do not end up, after a lot of work, with a solution that merely performs similarly. Popular programs are poclbm, phoenix and cgminer, but there are many others.



12AX7
Post Harlot
*****




Posts: 4803
Registered: 8-3-2005
Location: oscillating
Member Is Offline

Mood: informative

[*] posted on 19-4-2013 at 13:54


^ If you're after hashes, an ASIC will do it. If you're after things that are orthogonal to a hash function (i.e., if you can't express a DSP as a hash function.... erm... that'd be one convoluted DSP if you did, though), then an ASIC is the worst thing. An FPGA will do whatever you configure it for, of course.

Tim




Finnnicus
Hazard to Others
***




Posts: 342
Registered: 22-3-2013
Member Is Offline


[*] posted on 20-4-2013 at 01:01


You know the Bitcoin forum has hundreds of threads devoted to this?



