Machine Learning Goes Deeper with Intel Xeon Phi Processors

By Press Release

Machine learning implementations require an enormous amount of compute power to run mathematical algorithms and process huge amounts of data. With these challenges in mind, Intel has expanded its range of technologies for machine learning with the release of the Intel® Xeon Phi™ processor family. The Intel® Xeon Phi™ processoroffers robust performance for machine learning training models, and with the flexibility of a bootable host processor, it is capable of running multiple analytics workloads.Intel® Scalable System Framework-based clusters powered by the Intel Xeon Phi processors and available integrated Intel® Omni-Path Architecture, enable data scientists to run complex neural networks and run training models in significantly shorter time. In a 32-node infrastructure, the Intel Xeon Phi family offers up to 1.38 times better scaling than GPUs and in a 128-node infrastructure, the time to train models can be completed up to 50 times faster using the Intel Xeon Phi family3.

The Intel Xeon Phi family is complemented by the Intel® Xeon® processor E5 family, the most widely deployed infrastructure for machine learning4. Intel Xeon processor E5 v4 product family is well suited for machine learning scoring models and provides great performance and value for a wide variety of data center workloads. Together, these Intel Xeon processor families offer developers a consistent programming model for training and scoring and a common architecture that can be used for high-performance computing, data analytics and machine learning workloads.

Intel, the Intel logo, Xeon Phi and Omni-Path are trademarks of Intel Corporation in the United States and other countries.

*Other names and brands may be claimed as the property of others.

1 Intel® Xeon Phi™ processor delivers over 3 Teraflops of dual-precision performance, which is faster than the over 1 Teraflop of dual-precision performance for the Intel® Xeon Phi™ Coprocessor x100 Family

2 Source: Intel measured performance of Intel® Xeon Phi™ processor 7250 on STREAM Triad benchmark in Gigabytes/second as of March 2016.

3  Up to 50x faster training on 128-node as compared to single-node based on AlexNet* topology workload (batch size = 1024) training time using a large image database running one node Intel Xeon Phi processor 7250 (16 GB MCDRAM, 1.4 GHz, 68 Cores) in Intel® Server System LADMP2312KXXX41, 96GB DDR4-2400 MHz, quad cluster mode, MCDRAM flat memory mode, Red Hat Enterprise Linux* 6.7 (Santiago), 1.0 TB SATA drive WD1003FZEX-00MK2A0 System Disk, running Intel® Optimized DNN Framework.  Contact your Intel representative for more information on how to obtain the binary. Up to 38% better scaling efficiency at 32-nodes claim based on GoogLeNet deep learning image classification training topology using a large image database comparing one node Intel Xeon Phi processor 7250 (16 GB MCDRAM, 1.4 GHz, 68 Cores) using the same configuration as above., , Intel® Optimized DNN Framework with 87% efficiency to unknown hosts running 32 each NVIDIA Tesla* K20 GPUs with a 62% efficiency (Source:

Intel estimate based on internal Intel Xeon E5 processor sales data and customer feedback.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See for details.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can provide absolute security. Cost reduction scenarios described are intended as examples of how a given Intel- based product, in the specified circumstances and configurations, may affect future costs and provide cost savings.  Circumstances will vary. Intel does not guarantee any costs or cost reduction.

All dates and products specified are for planning purposes only and are subject to change without notice.

Relative performance for each benchmark is calculated by taking the actual benchmark result for the first platform tested and assigning it a value of 1.0 as a baseline. Relative performance for the remaining platforms tested was calculated by dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms and assigning them a relative performance number that correlates with the performance improvements reported.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Don't Miss ( 1-5 of 25 )