New offerings from Intel & AMD, Facebook’s lead in machine translation, Eta Compute — Issue #15

Al Gharakhanian
Aug 9, 2019

The overarching objective of this newsletter is to highlight key Deep Learning technological developments and innovations that impact the underlying hardware, including accelerator chips, IP, accelerator cards, tools, memory, and systems.

  1. Intel debuted its “Ice Lake” processor for laptops. This is Intel’s first mainstream processor based on its 10nm process technology, and it offers sizable deep learning capabilities thanks to up to 64 GPU execution units. Separately, Intel unveiled the high-performance PAC D5005 accelerator PCIe card for compute-intensive data center applications such as AI inference, media transcoding, and streaming analytics. The product is based on Intel’s Stratix® 10 SX FPGAs and will initially be available for HPE ProLiant DL380 Gen10 servers.
  2. Thanks to EEMBC (the Embedded Microprocessor Benchmark Consortium), we now have a new benchmark for measuring the performance of inference chips in embedded edge applications. MLMark uses three of the most common object detection and image classification models (ResNet-50, MobileNet, and SSDMobileNet). Unlike existing benchmarks such as MLPerf and DAWNBench, MLMark is strictly intended for embedded edge applications and measures three metrics: throughput (inferences per second), latency (processing time for one inference), and accuracy (percentage of correct predictions).
  3. Facebook’s AI models came out on top in multiple language processing tasks, including the news translation competition held at the fourth Conference on Machine Translation (WMT19). The most notable lead was in English-to-German translation, which happens to be the most competitive task in the contest. The performance lead can be attributed to large-scale sampled back-translation, noisy channel modeling, and data-cleaning techniques (details here).
  4. Google announced a series of image classification models derived from EfficientNets and optimized for the Edge TPU. In plain English: EfficientNets can be viewed as an automated way (using AutoML) of designing a CNN model optimized for a given hardware platform and desired performance metrics. The Edge TPU is a stripped-down, low-power version of Google’s TPU intended for edge applications. In a nutshell, one can view this development as a declaration that Edge TPUs can tackle vision workloads with performance that comes close to that of TPUs sitting in enormous data centers (details here).
  5. A day after Intel launched its second-generation Programmable Acceleration Card (PAC D5005, see above) for the data center, Xilinx on Tuesday announced the new Alveo U50 accelerator card with PCIe 4.0 and HBM. The product is based on Xilinx’s 16nm UltraScale+ architecture (not the newer Versal architecture). The target use cases for this beast are AI inference, video processing, data analytics, and fintech applications.
  6. AMD launched the Epyc 7002 series (Rome), the 2nd generation of its Epyc server processors for data center applications. The processor features 8 to 64 Zen 2 cores, combining up to eight 7-nanometer chiplets with a 14-nanometer I/O die. The Rome chip has doubled the performance per socket compared to the first generation of Epyc. AMD has clearly outdone Intel when it comes to pushing the performance envelope and using the latest process technologies. Despite its aggressive push in this market, AMD commands less than 10% of the server market. The company’s target is to achieve double-digit market share within the next two years. Intel’s strategy to defend its commanding market share revolves around product diversity. It has expanded its portfolio beyond CPUs and offers adjacent technologies such as AI accelerators (Nervana Neural Network Processors), FPGAs, FPGA accelerator cards, NAND storage, networking cards, and silicon photonics, among others.
  7. VeriSilicon has joined the ranks of several other IP companies such as CEVA, Imagination Technologies, Cadence, Synopsys, and ARM and has launched a new generation of Neural Processor Unit IP. The VIP9000 is a highly scalable and programmable processor for computer vision and artificial intelligence applications.
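
For readers who want to see what the three MLMark-style metrics from item 2 look like in practice, here is a minimal sketch. It is not EEMBC's harness: the `measure` function, the `toy_model` stand-in, and the sample data are all hypothetical, and a real benchmark would typically measure throughput and latency in separate runs rather than in one pass as done here.

```python
import time
from typing import Callable, Dict, Sequence, Tuple


def measure(run_inference: Callable[[int], int],
            samples: Sequence[Tuple[int, int]]) -> Dict[str, float]:
    """Compute throughput, mean latency, and accuracy over labeled samples.

    `run_inference` is a hypothetical stand-in for a real model call.
    """
    if not samples:
        raise ValueError("need at least one sample")

    latencies = []
    correct = 0
    start = time.perf_counter()
    for x, label in samples:
        t0 = time.perf_counter()
        prediction = run_inference(x)
        latencies.append(time.perf_counter() - t0)
        correct += int(prediction == label)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-12)

    return {
        "throughput_ips": len(samples) / elapsed,              # inferences per second
        "mean_latency_s": sum(latencies) / len(latencies),     # time for one inference
        "accuracy_pct": 100.0 * correct / len(samples),        # correct predictions
    }


def toy_model(x: int) -> int:
    """Toy classifier (assumption for illustration): predicts parity."""
    return x % 2


# Nine correctly labeled samples plus one deliberately wrong label.
samples = [(i, i % 2) for i in range(9)] + [(9, 0)]
result = measure(toy_model, samples)
print(result)
```

With the deliberately mislabeled last sample, the accuracy comes out at 90%; the throughput and latency values depend on the machine running the sketch.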

As for AI chips . . . .

Eta Compute

I had the opportunity to get a briefing from executives at Eta Compute. The company is based in Westlake Village and offers very low-power AI accelerator chips for a variety of edge inference applications, most notably voice processing and sensor applications.

They have chosen a unique architectural philosophy and have shied away from developing AI-specific custom compute arrays. Instead, they use proven and popular processor and DSP IP. The core of their chip consists of an ARM Cortex-M3 bolted to NXP CoolFlux DSP IP. This approach lets them draw on the wealth of tools and resources already available for both of these popular cores. No need to reinvent the wheel.

The power numbers are impressive. The chip is able to support voice recognition models at less than 1 mW in active mode, including the microphone and ADC, and 1 uA in rest mode. These numbers have been achieved by leveraging dynamic voltage scaling, asynchronous design techniques, and a sub-threshold design methodology.
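
To put those power figures in perspective, here is a rough back-of-the-envelope sketch of average current and battery life for a duty-cycled device. Only the 1 mW active and 1 uA rest figures come from the briefing (and 1 mW is an upper bound). The 3 V supply, the 10% duty cycle, and the 225 mAh coin-cell capacity are my assumptions for illustration, and the estimate ignores regulator losses and battery self-discharge.

```python
def average_current_ua(active_power_mw: float, sleep_current_ua: float,
                       duty_cycle: float, supply_v: float) -> float:
    """Average current draw in microamps for a duty-cycled device."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    if supply_v <= 0:
        raise ValueError("supply_v must be positive")
    # mW / V = mA; multiply by 1000 to get microamps.
    active_current_ua = active_power_mw / supply_v * 1000.0
    return duty_cycle * active_current_ua + (1.0 - duty_cycle) * sleep_current_ua


def battery_life_hours(capacity_mah: float, avg_current_ua: float) -> float:
    """Idealized battery life: capacity divided by average current."""
    return capacity_mah * 1000.0 / avg_current_ua


# From the briefing: 1 mW active (upper bound), 1 uA rest.
# Assumed for illustration: 3 V supply, 10% duty cycle, 225 mAh cell.
avg_ua = average_current_ua(active_power_mw=1.0, sleep_current_ua=1.0,
                            duty_cycle=0.10, supply_v=3.0)
life_h = battery_life_hours(capacity_mah=225.0, avg_current_ua=avg_ua)
print(f"average current: {avg_ua:.1f} uA, battery life: {life_h / 24:.0f} days")
```

Under these assumptions the average draw is roughly 34 uA, which works out to several months on a single coin cell. Changing the duty cycle is the quickest way to see how strongly the active-mode number dominates the result.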

While their platform can cater to many edge applications, voice processing seems to be front and center. Their solution goes above and beyond just “keyword” detection and is able to tackle demanding voice recognition use cases. They view support for RNNs and LSTMs as crucial for voice applications and have chosen an architecture augmented with elaborate tools serving such models.

Another major differentiator of their product is the integration of various application-specific building blocks such as analog-to-digital converters (ADCs) for sensor interfaces, power management, and a plethora of interfaces (e.g., I2C, I2S, GPIOs, …).

They have also done well when it comes to strategic partnerships. They have forged a partnership with Rohm Semiconductor to enable their wireless sensor initiative, and with TDK and Vesper on microphones for speech applications.

Another company in a crowded space that gets it.

Resources

An outstanding and comprehensive list of AI educational resources, tutorials, landmark papers, and a lot more by MONTRÉAL.AI (link). If you can’t find a specific topic in this collection, you probably don’t need to know about it.

========================================

Hope you have benefited from this issue. Please forward to others if you find value in this content. I always welcome feedback.

Al Gharakhanian

info@cogneefy.com | www | Linkedin | blog | Twitter
