BitScope Unveils Raspberry Pi Cluster With 2,880 CPU Cores For LANL HPC R&D

Subject: General Tech | November 30, 2017 - 12:48 AM |
Tagged: HPC, supercomputer, Raspberry Pi 3, cluster, research, LANL

The Raspberry Pi has been used to build cheap servers and small clusters before, but BitScope is taking the idea to the extreme with a professional enterprise solution. On display at SC17, the BitScope Raspberry Pi Cluster Module is a 6U rackable drawer that holds 144 Raspberry Pi 3 single board computers along with all of the power, networking, and air cooling needed to keep things running smoothly.

Each cluster module holds two and a half BitScope Blades with each BitScope Blade holding up to 60 Raspberry Pi PCs (or other SBCs like the ODROID C2). Enthusiasts can already purchase their own Quattro Pi boards as well as the cluster plate to assemble their own small clusters, though the 6U Cluster Module drawer doesn’t appear to be for sale yet (heh). Specifically, each Cluster Module has room for 144 active nodes, six spare nodes, and one cluster manager node.

For reference, the Raspberry Pi 3 features the Broadcom BCM2837 SoC with 4 ARM Cortex-A53 cores at 1.2 GHz and a VideoCore IV GPU, paired with 1 GB of LPDDR2 memory at 900 MHz, 100 Mbps Ethernet, 802.11n Wi-Fi, and Bluetooth. The ODROID C2 has an Amlogic S905 SoC with 4 Cortex-A53 cores at 1.5 GHz, a Mali-450 GPU, 2 GB of DDR3 SDRAM, and Gigabit Ethernet. Interestingly, BitScope claims the Cluster Module uses a 10 Gigabit Ethernet SFP+ backbone, which will help when communicating between Cluster Modules, but speeds between individual nodes will be limited to one gigabit at best (less in the real world, and in the case of the Pi it can fall short of even the 100 Mbps port rating because the Ethernet controller hangs off the SoC's USB 2.0 bus).
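To put the node-versus-backbone point in rough numbers, here is a minimal sketch that compares the combined line rate of a module's nodes with the backbone capacity. It assumes all 144 active nodes transmit at their port rating at once and that a module has a single 10 GbE uplink; BitScope's actual uplink count and switch layout are not stated here, and the function name is made up for illustration.

```python
def oversubscription(nodes, node_mbps, uplink_gbps):
    """Ratio of aggregate node line rate to uplink capacity.

    A value above 1.0 means the uplink, not the node ports, is the
    bottleneck when every node talks off-module at once.
    """
    aggregate_gbps = nodes * node_mbps / 1000.0
    return aggregate_gbps / uplink_gbps


# Assumption: one 10 GbE uplink per 144-node module (not confirmed by BitScope).
pi3_ratio = oversubscription(nodes=144, node_mbps=100, uplink_gbps=10)
c2_ratio = oversubscription(nodes=144, node_mbps=1000, uplink_gbps=10)

print(f"Raspberry Pi 3 module: {pi3_ratio:.2f}x oversubscribed")
print(f"ODROID C2 module:      {c2_ratio:.2f}x oversubscribed")
```

Under those assumptions a Pi 3 module could only slightly oversubscribe its uplink, while a module of gigabit nodes could oversubscribe it by an order of magnitude more.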

BitScope is currently building a platform for Los Alamos National Laboratory that will feature five Cluster Modules for a whopping 2,880 64-bit ARM cores, 720 GB of RAM, and a 10GbE SFP+ fabric backbone. Fully expanded, a 42U server cabinet holds 7 modules (1,008 active nodes / 4,032 active cores) and would consume up to 6 kW of power. LANL expects its 5 module setup to use around 3,000 W on average, though.
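The totals quoted above follow directly from the per-module and per-node figures. As a quick sanity check, here is a small sketch of that arithmetic; the constants come from the article, while the function names are only illustrative.

```python
NODES_PER_MODULE = 144   # active Raspberry Pi 3 nodes per Cluster Module
CORES_PER_NODE = 4       # Cortex-A53 cores per Raspberry Pi 3
RAM_GB_PER_NODE = 1      # LPDDR2 per Raspberry Pi 3


def cluster_totals(modules):
    """Return active node, core, and RAM totals for a number of modules."""
    nodes = modules * NODES_PER_MODULE
    return {
        "nodes": nodes,
        "cores": nodes * CORES_PER_NODE,
        "ram_gb": nodes * RAM_GB_PER_NODE,
    }


def watts_per_node(total_watts, modules):
    """Average power budget per active node, including shared overhead."""
    return total_watts / (modules * NODES_PER_MODULE)


lanl = cluster_totals(5)        # the LANL build
full_rack = cluster_totals(7)   # a fully populated 42U cabinet

print(lanl)
print(full_rack)
print(f"LANL average: {watts_per_node(3000, 5):.2f} W per node")
print(f"Full rack peak: {watts_per_node(6000, 7):.2f} W per node")
```

The per-node power numbers fold in networking and cooling, so they are a budget per slot rather than a measurement of what a single Pi draws.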

What are the New Mexico Consortium and LANL planning to do with all these cores? Well, playing Crysis would prove tough even if they could SLI all those GPUs, so instead they plan to use the Raspberry Pi-powered system to model much larger and prohibitively expensive supercomputers for R&D and software development. Building out a relatively low cost and low power system enables it to be powered on and accessed by more people, including students, researchers, and programmers, who can learn and design software that runs as efficiently as possible on massive multiple core and multiple node systems. Getting software to scale out to hundreds and thousands of different nodes is tricky, especially if you want all the nodes working on the same problem(s) at once. Keeping each node fed with data, communicating amongst themselves, and returning accurate results while keeping latency low and utilization high is a huge undertaking. LANL is hoping that the Raspberry Pi based system will be the perfect testing ground for software and techniques they can then use on the big gun supercomputers like Trinity, Titan, Summit (ORNL, slated for 2018), and other smaller HPC clusters.
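One common way to quantify how well software "scales out" is to run the same fixed-size problem on increasing node counts and compare wall-clock times. The sketch below computes speedup and parallel efficiency from such timings. It is a generic strong-scaling calculation, not LANL's methodology, and the sample timings are invented purely for illustration.

```python
def scaling_report(timings):
    """For each node count, return (nodes, speedup, efficiency).

    timings maps node count -> wall-clock seconds for the same fixed
    problem. Speedup is measured against the smallest node count, and
    efficiency is speedup divided by the increase in node count.
    """
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    report = []
    for nodes in sorted(timings):
        speedup = base_time / timings[nodes]
        efficiency = speedup / (nodes / base_nodes)
        report.append((nodes, speedup, efficiency))
    return report


# Hypothetical timings in seconds, not measurements from any real cluster.
sample = {1: 1000.0, 8: 140.0, 64: 25.0, 512: 10.0}

for nodes, speedup, efficiency in scaling_report(sample):
    print(f"{nodes:4d} nodes: speedup {speedup:6.1f}x, efficiency {efficiency:6.1%}")
```

Efficiency that drops off sharply as nodes are added usually points at communication or load-balancing overhead, which is the kind of problem a cheap many-node test bed lets developers find before they book time on a production machine.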

It is cool to see how far the Raspberry Pi has come. While I wish the GPU was more open so that the researchers could more easily work with heterogeneous HPC coding rather than just working with the thousands of ARM cores, it is still impressive to see what is essentially a small supercomputer with a 1008 node cluster for under $25,000!

I am interested to see how the researchers at Los Alamos put it to work and the eventual improvements to HPC and supercomputing software that come from this budget cluster project!

Source: BitScope

November 30, 2017 | 03:03 AM - Posted by PushItSuperStrongContest (not verified)

Over at ExtremeTech they ran this story and one forum poster insisted that this was worthless for supercomputing even after repeatedly being told this was for software testing/development and that this Pi cluster's use case is for helping develop software that scales properly across hundreds of nodes and thousands of CPU cores.

Using a Pi cluster, instead of valuable time on the actual production supercomputers, for testing the software's scaling across nodes/CPU cores and for other uses such as student learning/training is exactly why Los Alamos National Laboratory and academic institutions like this Pi cluster product idea.

So this Pi cluster has the exact same type of node/CPU core topology as the really big 100 million+ dollar machines, and those machines are better utilized running actual scientific workloads on production ready software that may have been tested/vetted on systems like this Pi cluster.

That poster over at ExtremeTech insisted on making the discussion into a pissing contest with the Pi cluster against some very costly multi-million dollar production supercomputers/other costly HPC systems. But a Pi cluster is a much more affordable solution for software testing and student learning and not as much about making the linpack top500.

November 30, 2017 | 04:59 AM - Posted by Tim Verry

Heh, yeah the community at ExtremeTech is not what it once was when the forum was still around :-(. That was the first forum community I joined IIRC and I was sad to see it go. You're right though, this is a low cost platform that makes it easier for students, young researchers, programmers, and HPC architects to play around with new ideas and learn how to code for multi-node environments without "wasting" time on the expensive and power hungry supercomputer hardware. That hardware is better served doing actual work as best it can, hopefully with improvements coming from the R&D once they are vetted on the cheap platform. Pi Cluster would also be good for smaller universities and startups wanting to get their feet wet in big data/HPC dependent apps I'd imagine?

In any case, that's a lot of Raspberry Pis!

November 30, 2017 | 06:38 AM - Posted by willmore

While I wouldn't recommend another supercomputer to stand in as the development environment for a supercomputer, as that seems to be missing the point a bit, I will say that using Rpi3 as the nodes is pretty silly given their poor bandwidth, weak processors, and poorly thought out power circuitry. The C2 is a much better choice. They have full speed GigE, faster processors that don't throttle if you sneeze nearby, and power circuitry that doesn't fail to power them at peak load.

Unless you're looking to simulate the flaws of the Rpi3, get the C2 version, it's going to be a much better supercomputer simulator.

November 30, 2017 | 12:06 PM - Posted by Tim Verry

Yeah the C2 does seem to have much better specs. It is more expensive though (actually not having much luck finding pricing on it but it is listed at $46 on their site).

It seems the XU4 is the new model and it has even better specs with more (and better) CPU cores, 2GB LPDDR3, and faster storage/networking/USB 3. That one doesn't look compatible with the BitScope Quattro boards though so wouldn't work in this cluster.

http://www.hardkernel.com/main/products/prdt_info.php?g_code=G143452239825#

November 30, 2017 | 04:53 PM - Posted by willmore

The XU4 predates the C2, but the XU4 uses a Samsung chip that has a much longer support life--that's why we're seeing derivative devices from HardKernel still. Take a look at the HC-1 "Home Cloud" device and the MC-1 "My Cluster"(?) devices:
https://magazine.odroid.com/article/odroid-hc1-and-odroid-mc1/

Both of those boards use the same SoC as the XU4 (and the fanless XU4Q). Since they got support in Linux 4.14 (LTS), HardKernel has committed to using that SoC for some time to come.

This new board is stripped of most of the I/O that makes the XU4 a useful development board, but they trade that for a really good heatsinking arrangement. That custom extruded aluminum heatsink takes the place of the BitScope mounting kits. I'm sure either HK or BitScope (or the two working together) will come up with something to leverage each other's strengths.

The C2 boards have been backordered for a little while now. I guess now that BitScope has announced their C2 based devices we know why!

I don't work for HK, I'm just a very satisfied customer. They're paying to get mainline Linux support for their boards and I respect that. I'm not really aware of another SBC vendor that's doing that, but I just might not know of it.

November 30, 2017 | 09:13 PM - Posted by Tim Verry

Ah that explains it then, I was guessing it was an older product since I wasn't finding it available for purchase anywhere. Yeah I guess we now know where most of their product is going heh.

November 30, 2017 | 10:12 PM - Posted by willmore

If you want to purchase any ODROID boards, the easiest way in the USA is from Ameridroid. I've purchased from them before and always found them to be knowledgeable and very friendly. Their staff talks to the people in Korea often and stays up to date with what's going on.

December 1, 2017 | 12:07 AM - Posted by Tim Verry

Good to know, their fanless XU4 looks nice as is their little mini cluster thing although I would not know what I'd do with it :-).

November 30, 2017 | 12:22 PM - Posted by ThinkScaleModelsForComputingClusters (not verified)

It's not production hardware, it's testing hardware, and it's not about speed, it's about measuring the scaling ability of the software across nodes/cores. So even though 100Mb/s networking on a 10Gb/s backbone is slow compared to the big production systems, the scaling measured across nodes/cores on the slower system can be extrapolated up to estimate how things would do on the production supercomputers.

And scaling is a relative metric, so you get a testing platform with say a few orders of magnitude fewer nodes/cores/networking resources, like the Pi cluster, and the scaling ratio on the smaller Pi cluster can be indicative of the scaling ratio performance on the production supercomputer with its orders of magnitude more nodes/cores.

So to save power and resources/money, this Pi cluster is a very affordable way to build a smaller scale model, including a smaller networking scale model, of the bigger production systems and use that for software development and software scalability analysis/testing so the software is ready for the larger production systems.

The network topology on the Pi cluster is a scaled down, slower model of the production supercomputer's network topology and faster network speed. So the Pi cluster can in fact be used for its intended functionality of testing the software's ability to scale efficiently across nodes/cores.

Take that 100Mb/s Pi and the Pi cluster's 10Gb/s backbone and the Pi cluster's node to core ratio and multiply it by say 10, and that's the same ratio for scaling/testing a more powerful system with 1Gb/s and 100Gb/s networking infrastructure and larger node and CPU core counts: the ratio of networking to node/CPU core resources remains the same.

So it's easy to extrapolate the scaling performance on the larger system with a scale model of the system done up with Raspberry Pi 3s. And that's what the Pi cluster is all about: testing software scalability on the scale model Pi cluster so that it will scale properly on any larger system with the same node to CPU core and networking resources ratio.

It's not a drag race between the Pi cluster and the production machines, as the Pi is most definitely not there for any linpack top500 sorts of metrics. The Pi cluster is a scale model of the larger systems, built for testing/development of the software to be used on those larger systems. And there can be many of these Pi cluster systems spread around the academic institutions that train students and develop software for the larger production supercomputer systems, with the academic institutions/supercomputer sites able to keep the actual production hardware doing actual production workloads. So this potentially represents millions of dollars saved.

November 30, 2017 | 05:03 PM - Posted by willmore

Try to be a bit more coherent, will you?

Yes, I am aware these systems aren't meant for 'production' use. They are meant for exploring the scaling implications of various characteristics of supercomputers. For example, they can use traffic shaping on their network interfaces to simulate a slower relative interconnect speed (CPU vs network), etc.

The reason I advocate the C2 instead of the Rpi3 is that the C2 has higher capabilities. Yes, that's important. Let's say you want to simulate a very high network to CPU ratio. For an Rpi3 with a slow network interface, you have to slow down the CPU to do so. For the C2, you have a factor of 10x more networking performance so that you can either collect a wider range of data, or get the data faster for a given ratio. 10x faster.

The faster CPU of the C2 does the same for low ratios of networking to CPU.

And we're not talking about a large price differential between the boards, so the greatly increased capabilities are offset by a very small price difference--most of the cost is fixed anyway. The networking, mounting, power delivery, etc., doesn't scale up with the price of the C2 boards. Even if it did, the difference is so small that it would be easily justified.

So, no, it's not a drag race, but these machines are a fixed resource as is the time of the people who use them. If you can get your data 10x faster or take 10x the amount of samples, you can get more work done in a given amount of time. That's not a negligible cost when those people are some very bright and highly paid PhDs.

December 1, 2017 | 12:31 AM - Posted by LessOfWillmoresDaftnessSVP (not verified)

It's not about speed, it's about software testing for node/CPU core scaling, and that can be done using the least amount of power, mush lesss than any production supercomputer, and still get the software's scaling tested and working properly for running on the scaled up systems where speed is needed.

The Pi cluster is cheaper and uses the least amount of energy for its intended function as a software testing platform and not a production platform.

Stop digressing the discussion as the Pi cluster is not intended for running a single production weather model, nuclear model, or any other production workload. The Pi cluster system is there only for software development/testing(Node Scaling/Pi cluster scale model of the production supercomputer system's topology) in the context of the New Mexico Consortium and LANL intended usage of that Pi cluster.

You are a master at being disingenuous willmore!

December 1, 2017 | 12:32 AM - Posted by LessOfWillmoresDaftnessSVP (not verified)

Edit: mush lesss
To: much less

December 1, 2017 | 06:28 AM - Posted by willmore

I addressed your points. You ignored mine. Then you personally attack me. Go away troll.
