Scaleway explains how it plans to use servers for a decade and why it's getting rid of RAID controllers.

A trend to extend the lifespan of servers beyond the typical three- to five-year range has companies such as Microsoft looking to add a few years of use to hardware that would otherwise be retired. The latest company to adopt this strategy is Paris-based Scaleway, a cloud services provider that’s sharing details about how it plans to get a decade of use out of its servers through a mix of reuse and repair.

Scaleway decided the carbon footprint of new servers is just too large – server manufacturing alone accounts for 15% to 30% of each machine’s carbon impact. Reusing existing machines, rather than buying new ones, could also significantly reduce e-waste. So Scaleway decided to retrofit its 14,000 servers rather than dispose of them. Marc Raynaud, hardware support manager at Scaleway, documented the project in a blog post.

Software RAID replaced RAID controllers

Scaleway found its old servers had a high RAID failure rate but were otherwise performing well. Batteries in the RAID controllers were the main source of failures. Raynaud wrote that replacing the batteries wouldn’t lead to long-term, reliable performance, since batteries degrade over time. Instead, Scaleway decided to remove the RAID controllers, which led to a large-scale retrofitting project.

“This realization led us to explore more modern server options that do not rely on hardware cards like the ones originally equipped in our older servers,” Raynaud said via email. “As one of our key objectives was to eliminate the need for these hardware RAID cards altogether, we have shifted our focus towards procuring servers that align with this goal.”

So the company is now using onboard SATA controllers to directly attach disks to its servers. “With this setup, any necessary RAID functionality can be achieved through software RAID features,” Raynaud said.

Each server has to meet performance standards

The company’s goal is to achieve a high level of reliability and performance on these servers through a three-step process of qualification, testing, and validation. Scaleway started by setting performance objectives for the finished product and taking a more detailed inventory of underperforming servers, looking at variables such as where they were located, the CPU, the catalog each server was being sold in, and what catalog they could be sold in after the retrofit was done.

With that done, Scaleway needed to test its retrofitted servers to see if they could actually meet the proposed use cases. It put together a checklist for its hardware engineering team to determine the constraints and requirements for each lot of servers.

Physical checks were designed to see how the servers performed in a production environment. One test made sure the RAID cards could be physically removed or bypassed to allow access to the disks. Read/write tests were done in all SATA modes to ensure the performance was as good as, if not better than, before the retrofit. Scaleway also upgraded the RAM and validated performance, carried out a CPU performance check, and reviewed firmware versions for the BIOS, BMC, etc., to see if updates were required.

Servers received memory upgrades

The company’s previous bare-metal offerings were equipped with memory configurations tailored to meet the needs of its clients at the time. However, as those needs evolved, there was a demand for more memory, so Scaleway added more.

“It is worth noting that the reliability of memory DIMMs [dual in-line memory modules] has significantly improved, and therefore the failure rate was not a decisive factor in our decision-making process,” Raynaud said. Scaleway also reused a lot of its existing stock of DDR3 and DDR4 DIMMs as part of the upgrade initiative.

Once the checks were completed, Scaleway began to physically move all 14,000 servers to a new data center.
For this process, Scaleway hired a team to unrack and move them at a pace of hundreds per week. After the servers arrived at the new data center, the upgrades – removing the RAID card, adding new memory, updating firmware, etc. – were performed. The servers have been relocated to multiple locations and data centers so as to optimize the network infrastructure and improve the overall performance and reliability of Scaleway’s systems, Raynaud said.

Not all servers qualified for the retrofit process, but those that didn’t were saved for spare parts. “After all, these servers are relatively old and will need maintenance in the future,” Raynaud wrote.

Scaleway’s servers have an age range of seven to 10 years. Raynaud says they have been properly maintained and upgraded over time to ensure that they remain reliable and fully operational.

“In the long run, these are investments that are good both for business and the environment. Resources needed to build new servers are not unlimited, and we’re proud to have developed a system to retrofit our old ones with minimal waste,” he said. “Today, our servers are used for up to 10 years – versus the industry average of three to four years – and nearly 80% of components are recycled.”
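
For readers who want to picture the qualification step described in the article, the following is a minimal Python sketch of how such a checklist might be encoded. It is not Scaleway's tooling: the `ServerReport` class, its field names, and the pass criteria are hypothetical. It assumes only what the article states, namely that the RAID card must be removable or bypassable, that read/write performance must be at least as good as before the retrofit, that CPU, RAM, and firmware are checked, and that servers that fail are kept for spare parts.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ServerReport:
    """Hypothetical per-server measurements; all field names are illustrative."""
    raid_card_removable: bool   # RAID card can be removed or bypassed
    read_mbps_before: float     # disk read throughput before the retrofit
    read_mbps_after: float      # disk read throughput after the retrofit
    write_mbps_before: float    # disk write throughput before the retrofit
    write_mbps_after: float     # disk write throughput after the retrofit
    cpu_check_passed: bool      # CPU performance check
    ram_validated: bool         # upgraded RAM validated
    firmware_reviewed: bool     # BIOS, BMC, etc. reviewed for updates


def failed_checks(report: ServerReport) -> list[str]:
    """Return the names of the checks this server fails (empty if it qualifies)."""
    checks = {
        "raid_card_removable": report.raid_card_removable,
        # "As good as, if not better than, before the retrofit."
        "read_performance": report.read_mbps_after >= report.read_mbps_before,
        "write_performance": report.write_mbps_after >= report.write_mbps_before,
        "cpu": report.cpu_check_passed,
        "ram": report.ram_validated,
        "firmware": report.firmware_reviewed,
    }
    return [name for name, ok in checks.items() if not ok]


def disposition(report: ServerReport) -> str:
    """Servers that pass every check are retrofitted; the rest become spare parts."""
    return "retrofit" if not failed_checks(report) else "spare parts"
```

Returning the list of failed checks, rather than a single pass/fail flag, is a design choice made here so that a per-lot report could show why a server was set aside.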