There are many real world applications such as XML data, parse trees in natural language processing, and protein sequences in bioinformatics that can be represented by tree structures. The conventional methods used for tree structured data classification are based on dot products on the feature vectors; this can be very expensive since those vectors can be extremely large.
Tree Kernel methods have proved to be a state of the art technique for many real world problems and they are able to process tree-based information without using an explicit representation of inputs. The main bottleneck of the current solutions of the tree kernels is the processing time and because the tree structure is complex, the timing complexity is the limiting factor for the tree kernel methods. In this project, we are going to propose a novel automata solution for convolution-based tree kernels on the AP. Our method can be applied to the applications that can be represented by the ordered and unordered labelled trees such as sentiment analysis in natural language processing.
By Elaheh Sadredini
This project focuses on regular expression matching which is playing an important role in a variety of applications, like genome sequence analysis, data mining, network inspection, etc. Different kinds of architectures such as CPU, XeonPhi, GPU, FPGA can perform regular expression matching. However, it is difficult on Von-Neumann architectures since it requires high irregular parallelism, high memory bandwidth as well as low latency.
Micron’s Automata Processor (AP) is designed for this kind of problem, using DRAM as a highly parallel reconfigurable fabric to implement NFAs. In this work we investigate for a fair comparison of best effort regular expression processing engines across all these aforementioned architectures which involves analyzing their performances with different types of regular expressions and exploring the design spaces of each of these architectures.
Our preliminary results indicate that CPU, XeonPhi and GPU are most likely bottlenecked by memory latency for rule lookup and AP and FPGA outperform other architectures due to their high capacity, their massively parallel execution and their capabilities of processing new input symbol every clock cycle. Furthermore, unlike FPGA, AP’s throughput is immune to complexity of NFA topologies and rulesets (i.e. large number of transitions, active states).
In the future work, we continue to explore the performance evaluation on more dimensions: “complexity” of regular expressions, the number of regular expressions, and multiple packets (streams) processing capability. We also extend the work to other benchmark suites that are not natural fits for regular expression, such as association rule mining, Markov chains, String kernel etc.
By Tommy Tracy
Entity Resolution (ER), the process of finding identical entities across different databases, is critical to many information integration applications. As sizes of databases explode in the big-data era, it becomes computationally expensive to recognize identical entities for all records with variations allowed across multiple databases. Profiling results show that approximate matching is the primary bottleneck.
Micron’s Automata Processor (AP), an efficient and scalable semiconductor architecture for parallel automata processing, provides a new opportunity for hardware acceleration for ER. We propose an AP-accelerated ER solution, which accelerates the performance bottleneck of fuzzy matching for similar but potentially inexactly-matched names, and use a real-world application to illustrate its effectiveness. Results show promising speedups for matching one record, with better accuracy over the existing CPU method.
By Dang, Vinh Quang
Sequence alignment refers to arranging sequences of DNA, RNA, or protein against reference sequences to identify regions of similarity in Bioinformatics. The major challenge is that the reads do not always perfectly match with references, and approximate matching is needed. The process is computationally expensive to compare large number of different reads against long references when fuzziness is allowed.
Micron’s Automata Processor (AP) is an efficient and scalable semiconductor architecture for parallel automata processing. The AP is based on an adaption of memory array architecture, exploiting the inherent bit-parallelism of traditional SDRAM. This new in-memory processing hardware architecture provides a new opportunity for sequence alignment. We use the new hardware to accelerate DNA alignment and compare with other famous sequence alignment tools (Bowtie2, Bowtie and PatMaN).
Results show at least $10$x speedup is achieved and more than 10000x speedup could be achieved when more variations are needed.
By Chunkun Bo