The Automata Processor (AP)

The Automata Processor (AP) is the first successful hardware implementation of a non-deterministic finite automata computing architecture that is especially well-suited for complex pattern matching on unstructured data.  This video conceptually explains how the AP leverages the memory-based state machine approach to perform parallel data mining quickly and efficiently.

Member Testimonial

Although the Micron Automata Processor (AP) was originally designed for text-based searches, I immediately realized its potential in addressing other classes of pattern recognition problems in Experimental High Energy Physics (HEP).  Over the past year, a number of us at Fermilab have been collaborating with researchers at the University of Virginia’s Center for Automata Processing to explore the suitability of the AP in particle track finding applications in HEP experiments.  We have successfully demonstrated a proof-of-concept and are now looking into its use in signal processing applications in data acquisition systems in HEP experiments.

Dr. Michael Wang

Physicist at Fermilab

CAP Publications, Reports, and Presentations

RAPID Programming of Pattern-Recognition Processors

Angstadt, W. Weimer, and K. Skadron. Proceedings of the ACM International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2016), to appear.

Brill Tagging on the Micron Automata Processor

K. Zhou, K. Wang, J. Fox, and D. Brown. IEEE International Conference on Semantic Computing (ICSC 2015).

Entity Resolution using the Micron Automata Processor

C. Bo, K. Wang, J. Fox, and K. Skadron. 5th International Workshop on Architectures and Systems for Big Data (ASBD), in conjunction with the 42nd International Symposium on Computer Architecture (ISCA 2015).

Cellular Automata on the Micron Automata Processor

K. Wang and K. Skadron; University of Virginia Technical Report #CS-2015-03.


Calendar Of Events

ISC-HPC 2016

CAP Graduate Research Assistant Tommy Tracy will present his work on implementing random forest machine learning on the AP at ISC-HPC. June 19-23, 2016. Frankfurt, Germany

SC 2016

Salt Lake City, Utah, November 13-18, 2016

Our Webinars

Click HERE for the full schedule and information on how to participate live.

About CAP

Academic Membership Application



Machine Learning Application: Tree Kernels


Many real-world data sets, such as XML documents, parse trees in natural language processing, and protein sequences in bioinformatics, can be represented as tree structures. Conventional methods for classifying tree-structured data compute dot products over explicit feature vectors, which can be very expensive because those vectors can be extremely large.

Tree kernel methods have proved to be a state-of-the-art technique for many real-world problems, and they can process tree-based information without an explicit representation of the inputs. The main bottleneck of current tree kernel solutions is processing time: because the tree structure is complex, time complexity is the limiting factor. In this project, we propose a novel automata-based solution for convolution-based tree kernels on the AP. Our method applies to problems that can be represented by ordered and unordered labelled trees, such as sentiment analysis in natural language processing.
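To make the idea of a convolution-based tree kernel concrete, here is a minimal CPU-side sketch of a subset-tree kernel in the style of Collins and Duffy. It is not the automata design proposed in this project. The nested-tuple tree encoding, the function names, and the `decay` parameter are assumptions made for illustration only.

```python
from itertools import product


def nodes(tree):
    """Yield every internal node of a nested-tuple tree.

    A tree is (label, child, child, ...); a leaf word is a plain string.
    """
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)


def production(node):
    """Return the grammar production at a node: its label plus its child labels."""
    label, *children = node
    return (label, tuple(c[0] if isinstance(c, tuple) else c for c in children))


def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    result = decay
    # Each child pair either stops here (the "1") or extends with a shared fragment.
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple) and isinstance(c2, tuple):
            result *= 1.0 + common_fragments(c1, c2, decay)
    return result


def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(common_fragments(a, b, decay)
               for a, b in product(nodes(t1), nodes(t2)))


# Hypothetical parse trees used only as example input.
tree_a = ("S", ("NP", "she"), ("VP", ("V", "runs")))
tree_b = ("S", ("NP", "he"), ("VP", ("V", "runs")))
print(tree_kernel(tree_a, tree_a))  # 10.0
print(tree_kernel(tree_a, tree_b))  # 6.0
```

The kernel value is obtained without ever building the explicit fragment feature vector, but the double loop over node pairs shows where the processing time goes as trees grow.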

By Elaheh Sadredini

Performance Evaluation of Regular Expression Engines Across Different Computer Architectures


This project focuses on regular expression matching, which plays an important role in a variety of applications such as genome sequence analysis, data mining, and network inspection. Different kinds of architectures, such as CPU, Xeon Phi, GPU, and FPGA, can perform regular expression matching. However, it is difficult on von Neumann architectures because it requires highly irregular parallelism, high memory bandwidth, and low latency.

Micron’s Automata Processor (AP) is designed for this kind of problem, using DRAM as a highly parallel reconfigurable fabric to implement NFAs. In this work, we aim for a fair comparison of best-effort regular expression processing engines across all of these architectures. This involves analyzing their performance with different types of regular expressions and exploring the design space of each architecture.

Our preliminary results indicate that CPU, Xeon Phi, and GPU are most likely bottlenecked by memory latency for rule lookup, and that AP and FPGA outperform the other architectures due to their high capacity, their massively parallel execution, and their ability to process a new input symbol every clock cycle. Furthermore, unlike the FPGA, the AP’s throughput is unaffected by the complexity of NFA topologies and rulesets (i.e., a large number of transitions or active states).
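As a loose software analogy for the point about active states, the sketch below simulates an NFA one input symbol at a time, keeping a set of enabled states. It is not the AP programming interface; the `State` class, its fields, and the example pattern are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class State:
    symbols: frozenset        # characters this state matches
    successors: tuple = ()    # indices of states enabled after this one matches
    start: bool = False       # enabled at every input position (unanchored search)
    report: bool = False      # emit a match event when this state matches


def run(states, text):
    """Return a list of (position, state index) report events."""
    always = {i for i, s in enumerate(states) if s.start}
    enabled = set(always)
    reports = []
    for pos, ch in enumerate(text):
        # Every enabled state inspects the same symbol; software must loop here.
        fired = {i for i in enabled if ch in states[i].symbols}
        reports.extend((pos, i) for i in sorted(fired) if states[i].report)
        # States that matched enable their successors for the next symbol.
        enabled = set(always)
        for i in fired:
            enabled.update(states[i].successors)
    return reports


# Hand-built automaton for the pattern a(b|c)*d, searched anywhere in the input.
pattern = [
    State(frozenset("a"), successors=(1, 2), start=True),
    State(frozenset("bc"), successors=(1, 2)),
    State(frozenset("d"), report=True),
]
print(run(pattern, "xabcbd ad"))  # [(5, 2), (8, 2)]
```

In this software version the work per input symbol grows with the number of enabled states, which is consistent with the rule-lookup bottleneck reported above for CPU-style engines. Hardware that evaluates all states at once does not pay that per-state cost.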

In future work, we will continue the performance evaluation along more dimensions: the “complexity” of regular expressions, the number of regular expressions, and the ability to process multiple packets (streams). We will also extend the work to other benchmark suites that are not natural fits for regular expressions, such as association rule mining, Markov chains, and string kernels.

By Tommy Tracy

Entity Resolution Acceleration Using AP

Entity Resolution (ER), the process of finding identical entities across different databases, is critical to many information integration applications. As sizes of databases explode in the big-data era, it becomes computationally expensive to recognize identical entities for all records with variations allowed across multiple databases. Profiling results show that approximate matching is the primary bottleneck.

Micron’s Automata Processor (AP), an efficient and scalable semiconductor architecture for parallel automata processing, provides a new opportunity for hardware acceleration of ER. We propose an AP-accelerated ER solution that targets the performance bottleneck, fuzzy matching of similar but potentially inexactly matched names, and we use a real-world application to illustrate its effectiveness. Results show promising speedups for matching one record, with better accuracy than the existing CPU method.
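The fuzzy name matching described above can be illustrated with a small CPU-side sketch. The text does not say which similarity measure the project uses, so Levenshtein edit distance is an assumption here, as are the function names, the `max_edits` threshold, and the sample names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def fuzzy_matches(query, names, max_edits=2):
    """Return the names within max_edits edits of the query, ignoring case."""
    q = query.lower()
    return [n for n in names if edit_distance(q, n.lower()) <= max_edits]


# Hypothetical records used only as example input.
records = ["John Smith", "Jon Smyth", "Jane Smith", "Joan Smithe"]
print(fuzzy_matches("Jon Smith", records, max_edits=1))
# ['John Smith', 'Jon Smyth']
```

Every query is compared against every record with a quadratic-time distance computation, which suggests why approximate matching dominates the profile as databases grow.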

By Dang, Vinh Quang

Sequence Alignment in Bioinformatics Using Micron’s Automata Processor

Sequence alignment in bioinformatics refers to arranging sequences of DNA, RNA, or protein against reference sequences to identify regions of similarity. The major challenge is that reads do not always match the references perfectly, so approximate matching is needed. Comparing a large number of different reads against long references is computationally expensive when fuzziness is allowed.
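A minimal sketch of this kind of approximate matching, assuming that only substitutions (mismatches) are tolerated and insertions and deletions are ignored, is shown below. The function name, the `max_mismatches` parameter, and the sample sequences are hypothetical.

```python
def find_approximate(reference, read, max_mismatches=1):
    """Return (position, mismatches) for each window within the mismatch budget."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = 0
        for ref_base, read_base in zip(window, read):
            if ref_base != read_base:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # budget exceeded, abandon this window
        else:
            # The loop finished without exceeding the budget.
            hits.append((start, mismatches))
    return hits


# Hypothetical reference and read used only as example input.
reference = "ACGTACGTTAGC"
print(find_approximate(reference, "ACGA", max_mismatches=1))  # [(0, 1), (4, 1)]
print(find_approximate(reference, "ACGT", max_mismatches=0))  # [(0, 0), (4, 0)]
```

The cost is roughly the reference length times the read length for each read, and a larger mismatch budget lets fewer windows be abandoned early.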

Micron’s Automata Processor (AP) is an efficient and scalable semiconductor architecture for parallel automata processing. The AP is based on an adaptation of memory array architecture, exploiting the inherent bit-parallelism of traditional SDRAM. This new in-memory processing hardware architecture provides a new opportunity for sequence alignment. We use the new hardware to accelerate DNA alignment and compare it with other widely used sequence alignment tools (Bowtie2, Bowtie, and PatMaN).

Results show a speedup of at least 10x, and a speedup of more than 10000x could be achieved when more variations are allowed.

By Chunkun Bo