June 2016

How can bottlenecks be prevented when processing big data?

LCL's Big Data Seminar included a survey of the latest generations of processors, DRAM and SSDs.

These components were also tested extensively, and the speakers explained which configurations deliver the best results when processing big data.

Alfredo Bonafede of HPE started by spelling out the stakes: 40 zettabytes (or 40 trillion GB) of data will be available online by 2020. This presents us with unprecedented opportunities and challenges. With such a vast amount of data around, there is a need for improved security, and the data also needs to be put to better use. Today, only a fraction of the data available is used to obtain meaningful knowledge.
However, if all privacy-sensitive information is removed from a medical database, for instance, the resulting raw data can be analysed and many interesting conclusions drawn.
HPE recommends that businesses wanting to get involved with big data completely rethink the architecture of their servers. Big data processing applications require superfast servers with huge amounts of computing power, whereas the actual data can be kept on storage servers that do not have to be state of the art. Dividing the server room along these lines can yield considerable savings.
Raphael Monten of Intel came to the seminar to present the latest version of the Xeon E5 processor. This fourth version primarily offers improved performance thanks to faster encryption. Businesses working with cloud applications will be interested to note that the new processor allows for much better orchestration of server resources: you can use it to measure how much CPU and memory the cloud applications use, and then use these measurements to balance the load more efficiently across servers.
Intel also presented its latest series of data storage options, and in particular the latest generation of superfast flash drives.

These new SSDs were immediately put under the microscope by Wannes De Smet, IT researcher at the Sizing Servers Lab. He set Intel's top-end models to work on a realistic workload, the kind you may come across at a Belgian business, and compared their performance with that of a more modest SSD with a more affordable price per GB.
An initial test involving a large big data set immediately produced an interesting result: the differences in performance remained surprisingly small. This is because big data processing is not very disk intensive. Significant differences were found, however, in the second test, in which the SSDs were put to work on a standard transactional database. In this case, spending a few extra euros on each GB clearly paid off.

Adrien Viaud of Kingston Technology provided a more detailed explanation of the improved data integrity, performance and bandwidth (and reduced energy consumption) of DDR4 DRAM. He also explained which configurations result in the best performance, and how you can use them to optimize server speed and capacity.

Next, Johan De Gelas, the head of the Sizing Servers Lab, tested a number of the configurations using a big data workload taken from a real-life situation. The CPU turned out to be the main bottleneck, followed by the memory. Storage caused the fewest problems. Given this, companies that work with big data would do well to invest first and foremost in fast processors, and in DRAM second.
Sizing Servers also shared a number of best practices to ensure servers are configured as efficiently as possible.
This optimum configuration is approximately 35% faster than the standard out-of-the-box configuration.

The final speaker of the day was our very own Managing Director, Laurens van Reijen.
He presented his vision of the data center of the future. Data centers such as LCL already form part of a hyperconnected ecosystem along with businesses, network providers and system integrators, and connectivity will increase even further in the future: large data centers will also collaborate more closely with the aim of pushing the use of the cloud to the limit. Ensuring the security of every customer's data will, of course, continue to be the chief concern of any data center worthy of the name.

LCL, your partner in data center outsourcing