On April 10, 2019, the Event Horizon Telescope published the first black hole image, a historic and monumental achievement that included lots of IT wizardry.
The Event Horizon Telescope (EHT) includes telescopes spread across the world at eight different sites, including the South Pole. Each site captures massive amounts of radio signal data, which goes to processing centers at the MIT Haystack Observatory in Westford, Mass., and the Max Planck Institute for Radio Astronomy in Bonn, Germany.
The data for the now-famous black hole image — captured in 2017 from galaxy M87, 53 million light-years away — required around 3.5 PB of storage. It then took two years to correlate the data to form an image.
The project’s engineers had to find a way to store the fruits of this astronomy at a cost that was, well, less than astronomical.
EHT’s IT challenges included finding the best way to move petabyte-scale data from multiple sites, acquiring physical media durable enough to handle high altitudes and finding a way to protect all of this data cost-effectively.
The cloud is impractical
Normally, the cloud would be a good option for long-term storage that unifies data sourced from multiple, globally distributed endpoints, which is essentially the role each individual telescope plays. However, EHT data scientist Lindy Blackburn said cloud is not a cold storage option for the project.
Each telescope records at a rate of 64 Gbps, and each observation period can last more than 10 hours. This means each site generates around half a petabyte of data per run. With each site recording simultaneously, Blackburn said the high recording speed and sheer volume of data captured made it impractical to upload to a cloud.
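The figures above can be checked with some back-of-the-envelope arithmetic. The sketch below assumes decimal units (1 Gbps = 10^9 bits per second, 1 PB = 10^15 bytes) and continuous recording; the 1 Gbps uplink is a purely hypothetical number chosen for illustration, not a figure from EHT.

```python
TB = 1e12  # bytes, decimal terabyte (assumption)
PB = 1e15  # bytes, decimal petabyte (assumption)


def recorded_bytes(rate_gbps: float, hours: float) -> float:
    """Bytes written when recording continuously at rate_gbps for the given hours."""
    bits = rate_gbps * 1e9 * hours * 3600
    return bits / 8


def hours_to_record(target_bytes: float, rate_gbps: float) -> float:
    """Hours of continuous recording needed to accumulate target_bytes."""
    return target_bytes * 8 / (rate_gbps * 1e9) / 3600


def upload_days(total_bytes: float, uplink_gbps: float) -> float:
    """Days needed to push total_bytes through a link sustained at uplink_gbps."""
    return total_bytes * 8 / (uplink_gbps * 1e9) / 86400


ten_hour_run = recorded_bytes(64, 10)
print(f"10 hours at 64 Gbps: {ten_hour_run / TB:.0f} TB")                   # 288 TB
print(f"Hours to reach 0.5 PB: {hours_to_record(0.5 * PB, 64):.1f}")        # about 17.4
print(f"Days to upload 0.5 PB at 1 Gbps: {upload_days(0.5 * PB, 1):.1f}")   # about 46.3
```

Under these assumptions, exactly 10 hours of recording yields about 288 TB per site, so the half-petabyte figure corresponds to a run somewhat longer than 10 hours. Even at a sustained 1 Gbps, a single site's half petabyte would take more than six weeks to upload.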
“At the moment, parallel recording to massive banks of hard drives, then physically shipping those drives somewhere is still the most practical solution,” Blackburn said.
It is also impractical to use the cloud for computing, said Geoff Crew, co-leader of the EHT correlation working group at Haystack Observatory. Haystack is one of EHT’s two correlation facilities, where a specialized cluster of computers combines and processes the telescopes’ radio signal data to eventually form a complete black hole image.
There are about 1,000 computing threads at Haystack working on calculating the correlation pattern between all the telescopes’ data. Even that is only enough to play back and compute the visibility data at 20% of the speed at which the data was collected. This is a bottleneck, but Crew said using the cloud wouldn’t speed the process.
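To put the 20% figure in perspective, here is a minimal sketch of the playback arithmetic. It assumes the playback fraction applies uniformly to the whole recording; the 10-hour input is an example value, and the function name is illustrative only.

```python
def playback_hours(recorded_hours: float, playback_fraction: float) -> float:
    """Wall-clock hours needed to play back a recording at a fraction of real time."""
    if not 0 < playback_fraction <= 1:
        raise ValueError("playback_fraction must be in (0, 1]")
    return recorded_hours / playback_fraction


# A 10-hour observation played back at 20% of the recording speed.
print(f"{playback_hours(10, 0.20):.0f} hours")
```

In other words, every hour of observation costs roughly five hours of correlator time before any human verification begins.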
“Cloud computing does not make sense today, as the volume of data would be prohibitively expensive to load into the cloud and, once there, might not be physically placed to be efficiently computed,” Crew said.
Crew added that throwing more hardware at it would help, but time and human hours are still spent looking at and verifying the data. Therefore, he said it’s not justifiable to spend EHT’s resources on making the correlators run faster.
Although Blackburn concluded physically transporting the data is currently the best option, even that choice presents problems. One of the biggest constraints is transportation at the South Pole, which is closed to flights from February to November. The cost and logistics involved in tracking and maintaining a multipetabyte disk inventory are also challenging. Therefore, Blackburn is always on the lookout for another method to move petabyte-scale data.
“One transformative technology for the EHT would be if we could send out raw data directly from the telescopes via high-speed communication link, such as via satellite laser relay, and bypass the need to move physical disks entirely,” Blackburn said. “Another more incremental advancement would be a move to solid-state recording, which would be lighter, faster and more compact. However, the timeline for that would depend entirely on the economics of SSD versus magnetic storage costs.”
Using helium hard drives
Another problem EHT ran into regarding the black hole image data was the frequency at which traditional hard drives failed at high altitudes. Vincent Fish, a research scientist at Haystack who is in charge of science operations, logistics and scheduling for EHT, said the EHT telescopes sit at elevations ranging from 7,000 to 16,000 feet above sea level.
“For years, we had this problem where hard drives would fail,” Fish said. “At high altitudes, the density of air is lower, and the old, unsealed hard drives had a high failure rate at high altitudes.”
The solution came in the form of helium hard drives from Western Digital’s HGST line. Hermetically sealed helium drives were designed to be lighter, denser, cooler and faster than traditional hard drives. And because they were self-contained environments, they could survive the high altitudes in which EHT’s telescopes operated.
“The industry ended up solving this problem for us, and not because we specifically asked them to,” Fish said.
EHT first deployed 200 6 TB helium hard drives in 2015, when it was focused on studying the black hole at Sagittarius A* (pronounced Sagittarius A-Star). Blackburn said EHT currently uses about 1,000 drives, some of which have 10 TB of capacity. It also has added helium drives from Seagate and Toshiba, along with Western Digital.
“The move to helium-sealed drives was a major advancement for the EHT,” Blackburn said. “Not only do they perform well at altitude and run cooler, but there have been very few failures over the years. For example, no drives failed during the EHT’s 2017 observing campaign.”
No backup for raw data
After devising a way to capture, store and process a massive amount of globally distributed data, EHT had to find a workable method of data protection. EHT still hasn’t found a cost-effective way to replicate or protect the raw radio signal data from the telescope sites. However, once the data has been correlated and reduced to tens of terabytes, it is backed up on site on several different RAID systems and on Google Cloud Storage.
“The reduced data is archived and replicated to a number of internal EHT sites for the use of the team, and eventually, it will all be publicly archived,” Crew said. “The raw data isn’t saved; we presently do not have any efficient and cost-effective means to back it up.”
Blackburn said, in some ways, the raw data isn’t worth backing up. Because of the complexity of protecting such a large amount of data, it would be simpler to run another observation and gather a new set of data.
“The individual telescope data is, in a very real sense, just ‘noise,’ and we are fundamentally interested only in how much the noise between telescopes correlates, on average,” Blackburn said. “Backing up original raw data to preserve every bit is not so important.”
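Blackburn's point about noise can be illustrated with a toy simulation. The sketch below is not EHT's correlator software, and real correlation also has to compensate for delays and phase between sites. It simply assumes two hypothetical sites that each record a faint common signal buried in much stronger independent noise, then measures how strongly the two recordings correlate on average. All names and parameter values are illustrative.

```python
import random


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_samples, source_amplitude, seed=0):
    """Two noisy recordings that share a common source scaled by source_amplitude."""
    rng = random.Random(seed)
    source = [rng.gauss(0, 1) for _ in range(n_samples)]
    # Each site adds its own independent unit-variance noise to the shared signal.
    site_a = [source_amplitude * s + rng.gauss(0, 1) for s in source]
    site_b = [source_amplitude * s + rng.gauss(0, 1) for s in source]
    return site_a, site_b


# With a shared source: expected correlation is a^2 / (a^2 + 1), about 0.083 for a = 0.3.
a, b = simulate(100_000, source_amplitude=0.3)
print(f"with common signal:    {correlation(a, b):.3f}")

# Without a shared source the correlation averages out toward zero.
a, b = simulate(100_000, source_amplitude=0.0)
print(f"without common signal: {correlation(a, b):.3f}")
```

Neither recording looks like anything but noise on its own. The information lives in the averaged correlation, which helps explain why losing individual raw bits matters less than it would for ordinary data.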
Backing up the raw data for the black hole image may become important if EHT ends up sitting on it for long periods of time as a result of the computational bottlenecks, Blackburn admitted. However, he said he can’t seriously consider implementing a backup process unless it is “sufficiently straightforward and economical.”
Instead, he said he’s looking at where technology might be in the next five or 10 years and determining if recording to hard drives and shipping them to specialized processing clusters will still be the best method to handle petabyte-scale raw data from the telescopes.
“Right now, it is not clear if that will be continuing to record to hard drives and using special-purpose correlation clusters, recording to hard drives and getting the data as quickly as possible to the cloud, or if SSD or even tape technology will progress to a point to where they are competitive in both cost and speed to hard disks,” Blackburn said.
Fish suggested launching a constellation of satellites via spacecraft rideshare initiatives, either through NASA or a private company, isn’t entirely out of reach, either. Whether it’s the cloud or spaceships, the technology to solve EHT’s petabyte-scale problem exists, but cost is the biggest hurdle.
“Most of our challenges are related to insufficient money, rather than technical hurdles,” Crew said.