# Optimization of re-configurable multi-core processors and security based on field programmable gate arrays

Prashant Bachanna<sup>1</sup>, Palla Hari Sankar<sup>2</sup>, Mukesh Kumar Tripathi<sup>3</sup>, Shivendra<sup>4</sup>, Kadali Ravi Kumar<sup>5</sup>, Nilesh Bhosle<sup>6</sup>

<sup>1</sup>Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, India <sup>2</sup>Department of Mechanical Engineering, G. Pulla Reddy Engineering College (Autonomous), Kurnool, India <sup>3</sup>Department of Computer Science and Engineering, Vardhaman College of Engineering, Hyderabad, India <sup>4</sup>Department of BCA, D.K College, Dumraon, India

<sup>5</sup>Department of Electrical and Electronics Engineering, Vasavi College of Engineering, Hyderabad, India <sup>6</sup>Department of Information Technology, Trinity Academy of Engineering, Pune, India

# Article Info

Article history:

Received Apr 5, 2023; Revised Oct 30, 2023; Accepted Nov 20, 2023

## Keywords:

AHB protocol; Clock domain crossing; FPGA; Machine learning; Support vector machines; Traffic generator design

## ABSTRACT

System-on-a-chip (SoC) based complex processors suffer from multithreading problems and malfunctions owing to their complexity and high-speed operation. To minimize these problems, the proposed design incorporates machine learning-based algorithms and cryptographic systems for security. Security is addressed in three stages: data integrity, data authentication, and encryption and decryption with private and public keys. To increase throughput with minimal latency, the proposed architecture places an advanced high-performance bus (AHB) protocol and an AHB to advanced peripheral bus (APB) bridge between the fabric of the dynamically reconfigurable multi-processor and the peripherals, along with security algorithms based on the secure hash algorithm (SHA-256) and the advanced encryption standard (AES). To support machine learning-based applications, the proposed system incorporates double-precision floating-point arithmetic operations. The overall architecture is developed in Verilog hardware description language (HDL), and its quality is checked using the LINT tool. The entire design is interfaced with the Zynq processor and the software development kit (SDK) tool to verify data transfer between hardware and software. Compared with existing state-of-the-art results, the proposed design achieves an 18% improvement in throughput, a 21% saving in power consumption, and a 34% reduction in latency.

This is an open access article under the <u>CC BY-SA</u> license.




## **Corresponding Author:**

Prashant Bachanna, Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, India. Email: b.prashant@iare.ac.in

# 1. INTRODUCTION

On-chip interconnection networks can benefit from implementing a multi-processor with network-on-a-chip (NoC) routing. The proposed work shows that reducing the input/output buffers yields a remarkable reduction in energy consumption without a significant loss in performance: when the traffic volume is low, as it is in most real-world applications, the loss is negligible compared with algorithms designed around buffered routing. Reducing the buffers in a multi-processor with NoC routing also improves router performance and, in turn, reduces router latency. Algorithms designed for the multi-processor with NoC system simplify the router design by removing complex buffer allocation and management techniques, so this method is a good choice for an interconnection network that is designed to run well at reduced peak throughput. A reduced-buffer network design, however, lacks many functions that work well in buffered networks, such as quality of service, different traffic classes, support for starvation avoidance/freedom, congestion awareness, tolerance of faulty routers/links, and energy management. In this work, we concentrate on implementing such support in the design of bufferless routing algorithms [1]. Artificial intelligence is becoming a major part of the solutions for different applications; machine learning is one of its subfields, and together they enable good decisions on big data sets. Machine learning is a set of techniques for creating equations, usually referred to as predictive systems, that dynamically identify patterns in data with the overall goal of predicting future data or carrying out various types of decision-making. Compared with systems using general-purpose central processing units or even general-purpose graphics processing units, reconfigurable hardware has been shown to provide significant speedups [2]. These speedups are achieved by fully utilizing the capacity to adapt the hardware architecture to the unique characteristics of real problem situations. Reconfigurable hardware bridges hardware and software, potentially achieving performance levels significantly higher than those of software while retaining a higher degree of flexibility than hardware [3].

A field-programmable gate array is the most popular kind of reconfigurable hardware device because of its adaptability, hardware-timed speed, dependability, and parallelism. Field-programmable gate arrays are reconfigurable integrated circuits that contain a variety of programmable logic blocks. In addition to the benefits already described, some field-programmable gate array families can reconfigure a portion of the chip while other parts of it continue to function, a feature that is widely applied in accelerated computing. Thanks to their programmable nature, field-programmable gate arrays are a good fit for many different markets, and they are particularly interesting in the context of human behavior identification. Another advantage is the quick response time and bandwidth savings that come from completing computing jobs immediately. Ensemble methods build a group of classifiers and classify new data by weighting the predictions of those classifiers [4]. The authors claim that this kind of method frequently outperforms a single classifier.

Additionally, ensembles are a well-known technique for merging less accurate classifiers to create highly accurate ones. Human activity recognition detects the current activity from sensor data gathered via accelerometers or wearable sensors. Owing to the accessibility, low cost, and low power consumption of sensors and accelerometers, and to the development of machine learning, this topic has emerged as one of the most popular research areas [5]. Human activity recognition can identify a variety of actions, including walking, standing, lying down, moving up or down stairs, and sitting. When used in conjunction with other technologies, it can be employed extensively for healthcare and eldercare, for recognizing driving behaviors to promote safe travel, and for recognizing military activity, among other uses. The importance of these application domains places significant demands on the accuracy of human behavior classification, as well as on execution time and power consumption. To provide a suitable reaction time, feature extraction and learning algorithms should be designed carefully. The combination of all these limitations and the need to execute intricate algorithms to obtain acceptable classification rates creates a difficult design problem in the field of embedded systems [6].

The performance and planning of the network-on-chip depend on the design of efficient, high-performance routers. This work uses a reduced-deflection CHIPPER for bufferless deflection routers, with two permutation deflection networks (PDNs) that receive identical sets of inputs but are initialized differently, to obtain the maximum number of productive outcomes at the output ports. The design selects the PDN that delivers more productive output ports, which in turn reduces the deflection rate of flits. Compared with state-of-the-art bufferless deflection routers, the average flit latency is reduced for the same critical path latency. To address the complexity issue while enhancing classification speed and preserving low power consumption, we implement and validate a hardware design. To facilitate the dynamic deployment of numerous accelerator cores, this project designs and assesses a comprehensive field-programmable gate array (FPGA)-based, run-time reconfigurable architecture. The major goal can be separated into three goals. The first is to create an FPGA-based architecture that allows for dynamically reconfigurable accelerator cores. The second is to use and assess the scalability of the FPGA-based architecture in accommodating numerous dynamically reconfigurable accelerator cores. The third is to use the hardware reconfigurability and parallelism of an FPGA to increase classification speed compared with a classifier running on a general-purpose central processing unit. Support vector machines (SVM) search for a hyperplane that clearly separates the data points in an N-dimensional space, where N is the number of features.

Hyperplanes act as decision boundaries: data points located on either side of a hyperplane can be assigned to different classes. Support vectors are the data points that lie closest to the hyperplane, and they control its position and orientation. The goal is to identify the plane with the largest margin, that is, the greatest separation between data points belonging to distinct classes. Maximizing the margin distance improves the accuracy of future data point classification. The classifiers assessed in [7] were implemented on the basis of previous publications. The K-nearest neighbors algorithm was evaluated using various neighbor counts; the best outcome was seen when the number of neighbors was set to 18, with equal weight given to each neighbor. In addition to the decision tree classifier (DTC), a random forest ensemble made up of 70 DTCs was used. Each tree in the random forest predicts a class, and the class with the most votes becomes the model's prediction. The multilayer perceptron (MLP), which included three hidden layers, was developed using Keras. The convolutional neural network implemented for the network on chip (NoC) application has three convolution layers, three maximum pooling layers, and a fully connected layer prior to the output layer.

Moore's Law has largely come to a stop, according to an earlier study [8]; while this may be debatable, it cannot be denied that all technologies eventually reach a saturation point. The scaling of CPU performance and efficiency is subject to Moore's Law. Alternative architectures, such as domain-specific hardware accelerators, are one of the few ways to continue growing the performance and efficiency of computer hardware [9]. According to [10], domain-specific accelerators increase performance and efficiency through four main concepts: data specialization, parallelism, local and optimized memory, and reduced overhead. Specialized logic operating on domain-specific data enhances both performance and efficiency, and a high degree of parallelism improves performance further.
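The SVM decision rule described earlier in this section can be illustrated with a minimal sketch. It assumes a linear SVM whose weight vector `w` and bias `b` are already known; the function names and the example hyperplane are hypothetical and are not taken from the proposed design.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Assign class +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the margin hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical 2-D hyperplane x1 + x2 - 3 = 0 (illustrative values only).
w, b = [1.0, 1.0], -3.0
labels = [classify(w, b, point) for point in [(0.0, 0.0), (3.0, 3.0)]]
```

A larger `margin_width` corresponds to a smaller weight norm, which is why training a maximum-margin classifier amounts to minimizing the norm of `w` subject to correct classification.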

# 2. LITERATURE SURVEY

With rapid improvements in cloud computing and the Internet of Things (IoT), high-performance and power-efficient designs are required to handle enormous volumes of data. Recent designs show a performance gap when faced with the rapid increase in the amount of data to be processed and the required processing speed. A multi-processor with NoC uses a packet-switched fabric for on-chip communication and is the default mechanism for interconnecting different cores; it serves as an important reference for many applications that require energy-efficient systems.

Congestion is a considerable challenge for a multi-processor with NoC: it increases power consumption and consumes large bandwidth. A hybrid multi-processor with NoC has been proposed in the form of a framework that combines bufferless and buffered NoCs and is based on knowledge of the applications and their performance demands [11]. Trace-driven simulators are employed to generate data similar to that found in big data applications. Compared with a traditional buffered multi-processor with NoC, the hybrid multi-processor with NoC improves performance in mixed applications by 17% on average and by 24% at most, improves fairness by a little more than 13%, and reduces power consumption by 38% [12].

This system has the added advantage of guaranteeing livelock and deadlock freedom using the MaS algorithm, which has been proved. Accurate simulations show that MaS performs better than BLESS-worm: power consumption is reduced by 9% and average packet latency by 10%, and the buffer requirement at the receiver end is up to 80% lower than that of BLESS-worm. For a more complete assessment, real traffic analyses remain to be conducted, and the routing algorithm should be assessed under other priority ranking policies, such as the most-deflecting-first method [13]. The multi-processor with NoC is an emerging paradigm for on-chip communication. Specifically, chips need less power and area if bufferless multi-processors with NoC are employed in the design, but scheduling them to achieve full capacity is a considerable problem. On-chip communication is studied in [14] by first characterizing the capacity of bufferless multi-processors with NoC. This method gives optimal periodic schedules for many bufferless on-chip networks with complete-exchange data-flow patterns.

The proposed data schedule is specific and fits the programming models distributed across the network as well as the control mechanisms used to manage network congestion. For general traffic patterns, more suitable greedy scheduling algorithms are used, which perform better than the available greedy online algorithms and avoid deadlocks. Simulations show that the suggested algorithms improve the throughput by up to 35 on a torus network [15]. Photonic waveguides have been proposed as a solution to the power and speed bottlenecks on chips. Combining high-radix multi-processors with NoC with on-chip waveguides, enabled by advances in nanophotonic technology, yields a better-performing multi-processor with NoC. A bufferless photonic Clos network is proposed to maximize the utilization of silicon photonics. The sustained and informed dual round-robin matching algorithm is the scheduling algorithm proposed to solve fan-out contention problems, and distributed and informed path allocation is the path allocation scheme proposed to solve the Clos network routing problem; this method also achieves an efficient off-chip laser-power budget [16].

For online fault diagnosis, a hybrid automatic repeat request and forward error correction link-level error control scheme is used to identify both permanent and transient faults, together with reinforcement-learning-based fault-tolerant deflection routing (FTDR), which is designed to tolerate livelock and deadlock. The results show that using FTDR-H and FTDR routers reduces the area consumption on the chip by 27% in an 8×8 network [17]. Under synthetic workloads with a permanent link fault, simulations of an 8×8 network show that the throughput of the FTDR-H and FTDR algorithms is 23% and 14% higher on average than that of the fault-on-neighbor (FoN) aware, cost-based deflection routing algorithm and the deflection routing algorithm, respectively. With real-world traffic, the FTDR-H algorithm obtains 20% fewer hops on average than the FoN algorithm. This work also implements fault-tolerant deflection routers running at 400 MHz in TSMC 65-nm technology [18]. QORE is a fault-tolerant NoC with multi-function channel (MFC) buffers.

MFC buffers reduce router power consumption and improve performance by removing in-router buffering. Simulations using synthetic traffic mixes and real benchmarks show that QORE improves throughput by a factor of 2.3 and speedup by a factor of 1.3 compared with Vicis and Ariadne, two multi-processor with NoC designs known for fault tolerance. Synthesized with the Synopsys Design Compiler, the QORE design reduces the power consumption of the network by 21% with minimal control overhead [19]. The RoShaQ router uses shared queues: during light traffic it has low latency because arriving packets effectively bypass these queues. Experiments on a 65-nm complementary metal-oxide-semiconductor (CMOS) standard-cell design show that, over synthetic traffic, RoShaQ achieves 17% lower latency and 18% higher saturation throughput than a typical virtual channel (VC) router. In real-world multitasking applications and E3S embedded benchmarks using a near-optimal mapping algorithm, RoShaQ has 32% lower latency than the VC router at the same throughput, with 30% lower energy per packet [20].

The experimental results also show that the proposed algorithm is extremely fast and achieves remarkable performance improvements over conventional uniform buffer allocation. Using the smart buffer allocation algorithm in a complex application involving video/audio data saves about 80% of the buffering resources [21]. Thermal hot spots in a three-dimensional multi-processor with network-on-chip considerably limit the performance gain of three-dimensional integration. A systematic design flow is proposed to reach the optimal configuration. In traffic-thermal co-simulation experiments, the achieved throughput of the proposed system design can be improved to 45.2% from 2.7% [22].

Another proposed method analyzes faults in the inverter switches of multi-level inverter circuitry using a decision tree machine learning algorithm. The non-carrier-based digital pulse-width modulation (DPWM) signal is generated using the event angle for the 7-level switched ladder inverter. The method investigates stuck-at-fault occurrences in the 4 switches of the inverter by manipulating decision tree parameters such as entropy, information gain, and decision tree [23]. Proposed physical unclonable functions have drawbacks, such as high delay imbalances caused by routing constraints. Therefore, a relative placement method has been explored to implement symmetric routing in the obfuscated delay-based physical unclonable function on the FPGA board. The delay analysis shows that the symmetric routing was implemented successfully, achieving good physical unclonable function quality with a uniqueness of 48.75%, reliability of 99.99%, and uniformity of 52.5%. Moreover, the obfuscation method, an arbiter physical unclonable function combined with a random challenge permutation technique, reduces the vulnerability of the arbiter physical unclonable function to machine learning [24]. Among the existing solutions to classification, decision tree classification can achieve high accuracy while handling large data sets. However, it is a computationally intensive algorithm, and its running time increases with the size of the dataset, ranging from hours to even days. FPGAs can be used for large datasets to achieve a high-performance implementation with low energy consumption [25], [26].

## 2.1. Problem statement

System-on-a-chip (SoC) based complex processors suffer from multithreading problems and malfunctions owing to their complexity and high-speed operation. To minimize these problems, the proposed design incorporates machine learning-based algorithms and cryptographic systems for security. Security is addressed in three stages: data integrity, data authentication, and encryption and decryption with private and public keys. The multi-processors therefore have complete protection for permissions such as data accessibility, write/read operations, and protectability. The remainder of this paper is organized as follows: section 1 introduces previous work; section 2 describes related work on SoC-based processors, NoC, and machine learning; section 3 explains the main contribution and the proposed algorithms; section 4 describes the functionality of the proposed work; section 5 presents the results and discussion; and section 6 concludes.

# 3. PROPOSED DYNAMICALLY RECONFIGURABLE MULTI-PROCESSOR AND SECURITY USING MACHINE LEARNING AND THEIR OPTIMIZATIONS

The complete proposed architecture is presented in Figure 1. It couples a Zynq software core processor with a co-processor that acts as a customized accelerator. An instruction cache is required to enable/disable the Zynq for continuous monitoring and to interface with the stream of the processor. The cache is always internal to the Zynq, so it offers no direct interface and cannot be probed for monitoring; the processor fetches data directly from the cache memory through instructions, and the migration technique cannot be applied. The solution is to store the data in external memory so that the Zynq processor can access the data, and it can be disabled whenever needed.

In the proposed architecture, the customized accelerator can access the external cache memory in addition to the block random access memory (BRAM), which is placed internally. To improve access between the accelerator and the double data rate (DDR) memory, a small, high-speed cache intellectual property (IP) core is added; it connects the two directly, which speeds up repeated accesses to a given address between the DDR memory and the accelerator. The architecture also has an advanced high-performance bus (AHB) with a counter and timer, connected to the Zynq and the peripherals through an interface to transfer data; the AHB also connects to the host interface for direct data communication.

The AHB has advantages in terms of burst length, power savings, and low latency, and it also supports system-level cache access and unaligned byte strobes and addresses. Initially, the software stores the boot image code in static random access memory (SRAM) with the help of direct memory access (DMA). Upon completion of the secure hash algorithm (SHA) operation, the software reads the SHA hash result through the advanced peripheral bus (APB) and copies the SHA hash value of the public key to memory. The encryption keys are stored in plain text in one-time programmable (OTP) memory, which is controlled by hardware. The OTP content is encrypted using the advanced encryption standard (AES), and the results are stored in flash memory. Using the APB protocol, the stored keys are read from flash and passed to the AES and SHA blocks for key exchange and decryption.
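The data-integrity stage of this flow can be sketched in software as follows. This is a minimal illustration using Python's standard `hashlib` and `hmac` modules, not the hardware SHA engine of the proposed design; the function names and the boot image contents are hypothetical, and the AES stage is omitted.

```python
import hashlib
import hmac


def sha256_digest(image: bytes) -> bytes:
    """Compute the SHA-256 digest of a boot image held in memory."""
    return hashlib.sha256(image).digest()


def verify_boot_image(image: bytes, expected_digest: bytes) -> bool:
    """Integrity check: recompute the digest and compare it in constant time."""
    return hmac.compare_digest(sha256_digest(image), expected_digest)


# Hypothetical boot image and its reference digest (illustrative values only).
boot_image = b"hypothetical boot image contents"
stored_digest = sha256_digest(boot_image)

intact = verify_boot_image(boot_image, stored_digest)
tampered = verify_boot_image(boot_image + b"\x00", stored_digest)
```

Any single-byte change to the image produces a different digest, so `tampered` evaluates to `False` while `intact` evaluates to `True`.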



Figure 1. Proposed architecture of dynamically reconfigurable multi-processor for machine learning systems

# 4. FUNCTIONALITY OF PROPOSED DYNAMICALLY RECONFIGURABLE MULTI-PROCESSOR

The multi-processor core generates the data and the address to be processed and sends them through the AHB-to-APB bridge to slower peripherals such as I2C and I2S. These two interfaces are used specifically to transfer audio signals between the processor and different audio processing algorithms, increasing performance with minimal latency. The APB bridge scales down the number of clock cycles in the 'setup' and 'access' states of the APB to match the speed of the audio and interface modules. To reduce data/packet losses, the bridge stores 32-bit packets in a transmit FIFO (Tx-FIFO) with a minimum depth of 256, and the stored data can be read at a later stage by the serial peripherals. The proposed multi-processor is a Cortex-based ARM microprocessor that includes security systems and NoC routers to interface with all peripherals. Each peripheral is interfaced with the processor through the AXI interconnect and bridges to increase the throughput and decrease the latency. The serial protocol master processes the data according to the protocol standard and converts it into serial bits, which are transferred to general purpose input/output (GPIO) pins through the FPGA PMOD connectors to interface with external devices. The master sends the serial bits along with the serial clock, which the slave uses as its input clock to synchronize the serial data bits. The slave delimits the received bits using the start and stop bits, converts them into parallel data, and stores them in the receiver FIFO (Rx-FIFO) so that the APB bridge can read the stored data whenever required. The proposed top-level architecture of a machine learning-based SVM accelerator, depicted in Figure 2, has a number of input/output buffers and weight buffers that provide efficient buffering for further processing. To minimize off-chip memory traffic, a specialized 3D network-on-chip (3D-multi-processor with NoC) is incorporated in the proposed system to redistribute output packets through a multi-banked input buffer instead of sending packets for destination nodes to external memory. The SVM and 3D-multi-processor with NoC operations are performed independently using different processing elements (PEs). In Figure 2, the control module generates the control signals for all other modules and is responsible for transferring and controlling the streaming data and delivering it to the multi-banked input and weight buffers of each PE. The I2C slave controller core provides an interface between a microprocessor and an I2C master device. It offers a parametrized FIFO depth. The standard APB interface makes it easy to integrate into any SoC or peripheral sub-system, and multiple status bit flags simplify software (SW) and IP bring-up.
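The buffering role of the Tx-FIFO between the fast bridge and the slower serial peripherals can be sketched behaviourally. The model below is an assumption-laden illustration in Python, not the Verilog implementation: the class and function names are hypothetical, and the MSB-first bit order is assumed rather than taken from the design.

```python
from collections import deque

FIFO_DEPTH = 256          # minimum Tx-FIFO depth stated in the text
WORD_MASK = 0xFFFFFFFF    # packets are 32 bits wide


class TxFifo:
    """Behavioural model of the bridge's transmit FIFO (names are hypothetical)."""

    def __init__(self, depth=FIFO_DEPTH):
        self.depth = depth
        self._words = deque()

    def is_full(self):
        return len(self._words) >= self.depth

    def push(self, word):
        """Bridge side: store one 32-bit word; return False if the FIFO is full."""
        if self.is_full():
            return False
        self._words.append(word & WORD_MASK)
        return True

    def pop(self):
        """Peripheral side: read the oldest word, or None if the FIFO is empty."""
        return self._words.popleft() if self._words else None


def serialize_word(word, width=32):
    """Convert a parallel word into a list of serial bits, MSB first (assumed order)."""
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]


fifo = TxFifo()
accepted = [fifo.push(n) for n in range(FIFO_DEPTH + 1)]  # the last push overflows
first_bits = serialize_word(fifo.pop())
```

The rejected final push shows why the depth matters: once the slow side falls 256 words behind, the bridge has to stall or drop data.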



Figure 2. Proposed architecture of dynamically reconfigurable multi-processor for machine learning systems

# 5. RESULTS AND DISCUSSION

Most SoCs use the 3D-multi-processor with NoC as the main communication technique for all on-chip communication. The aim of this project is to examine and implement a prototype of an asynchronous multi-processor with NoC on FPGAs. The crucial and tedious job is to implement the asynchronous multi-processor with NoC circuit on standard FPGAs; for this reason, part of the work was devoted to establishing the design flow for FPGAs. The project delivers a complete and successful multi-processor with NoC for an FPGA. A four-phase bundled-data handshake protocol is used in the implementation. The network adapters and a router are the two main components of the multi-processor with NoC design. The open core protocol (OCP) interface connects the cores to the network. A mesh topology forms the network connections in the small multi-purpose processor used to validate the multi-processor with NoC experimentally. The network on chip is a technology that is now spreading worldwide for communication within systems on chip. This paper concentrates on modeling asynchronous multi-processors with NoC on FPGAs. The basic task is to design the asynchronous circuit and implement it on standard FPGAs, which includes designing the flow, testing it, and evaluating its performance on the FPGAs. A best-effort multi-processor with NoC has been developed to meet these needs and has been implemented on the FPGAs. The multi-processor with NoC implementation comprises system connectors, a switch, a universal asynchronous receiver-transmitter (UART), and memory, which communicate using the four-phase handshake. The OCP interface is the most used protocol in the system. A working topological model is key to demonstrating the design in real time, so the designed multi-processor with NoC is tested on a small multi-purpose processor model. A 3×3 multi-processor with NoC and its UART protocol have been designed; the Verilog hardware description language (HDL) code is simulated and tested on the Artix-7 FPGA kit, and testing is carried out with the ChipScope software tool. The virtual input/output (VIO) and integrated controller (ICON) cores are used to demonstrate the transfer of data from source to destination. The effectiveness of the data transfer is measured and analyzed in terms of packet delivery ratio (PDR), latency, and hardware utilization, such as slices, look-up tables (LUTs), flip-flops, and area; the packets are generated using a traffic pattern generator. The results show an improvement of about 12% in LUTs, 7% in flip-flops, and 23% in throughput, and a 26% reduction in delay.
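The transfer metrics named above can be computed from simple per-packet records. The sketch below assumes that each delivered packet is logged as an (injection cycle, arrival cycle) pair; the function names and the sample values are hypothetical and are not measurements from this work.

```python
def packet_delivery_ratio(sent, received):
    """PDR: fraction of injected packets that reached their destination."""
    if sent <= 0:
        raise ValueError("at least one packet must be sent")
    return received / sent


def average_latency(records):
    """Mean latency in cycles over (inject_cycle, arrival_cycle) pairs."""
    if not records:
        return 0.0
    return sum(arrival - inject for inject, arrival in records) / len(records)


def throughput(received, cycles):
    """Delivered packets per cycle over the observation window."""
    if cycles <= 0:
        raise ValueError("observation window must be positive")
    return received / cycles


# Hypothetical log: 10 packets injected, 9 delivered within 100 cycles.
log = [(0, 4), (2, 8), (5, 9), (7, 12), (9, 15),
       (11, 16), (13, 19), (15, 20), (17, 23)]
pdr = packet_delivery_ratio(sent=10, received=len(log))
mean_latency = average_latency(log)
rate = throughput(received=len(log), cycles=100)
```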

## 5.1. Traffic generator design for the multi-processor with NoC

A direct traffic generator was designed and used for testing. It consists of a traffic source and a traffic sink. The traffic source transmits predetermined data packets held in a fixed data storage (ROM), and the traffic sink receives the packets so that each one can be matched to its proper input and output pair. The movement of data packets through the multi-processor with NoC design is shown in Figure 3. Every ROM entry holds the data together with the handshake signal, which suits the operation of the simplest router, operating with a single NOR gate. An asynchronous counter, clocked by the acknowledgment signal, generates the fixed-storage address and increments it through the available data size, so the fixed data memory drives the request signal without any delay. For correct low-level formatting of the fixed storage output, the clock input is gated with the reset signal. The de-multiplexer for the output channel is controlled by the flit-type control signals.

The traffic sink must store the received data packets for processing. On the FPGA, the received data are extracted through the JTAG cable interface; in the verification environment this is done with the ChipScope Pro tool, which operates over the JTAG interface [26]. ChipScope is a GUI-based, interactive tool that makes it easy to debug and verify the design by connecting to any of its ports, and the ChipScope software extracts and displays the data captured by the cores. The ChipScope VIO core collects the data for the sink design: the core captures the data signal on the falling edge of the request signal and sends the data to the ChipScope program when its internal storage is full. Every signal that needs to be monitored in the design requires a VIO core. ChipScope offers cores that can track and archive data traces for any signal in the FPGA throughout operation.
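The source and sink behavior described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' Verilog design: the class names `TrafficSource` and `TrafficSink`, the ROM contents, and the idealized router that acknowledges every ROM output are all hypothetical. It shows a ROM whose entries carry a data word and a request bit, an address counter advanced by each acknowledge, and a sink that captures the data word on the falling edge of the request signal.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficSource:
    """ROM-backed source: each entry holds a data word and a request bit."""
    rom: list          # list of (data, req) tuples; contents are illustrative
    addr: int = 0      # address counter, advanced by each acknowledge

    def output(self):
        """Return the current ROM output; past the last entry, req stays low."""
        if self.addr < len(self.rom):
            return self.rom[self.addr]
        return (0, 0)

    def acknowledge(self):
        """An acknowledge event clocks the counter to the next ROM address."""
        if self.addr < len(self.rom):
            self.addr += 1


@dataclass
class TrafficSink:
    """Captures the data word on each falling edge of the request signal."""
    prev_req: int = 0
    captured: list = field(default_factory=list)

    def sample(self, data, req):
        if self.prev_req == 1 and req == 0:   # falling edge of request
            self.captured.append(data)
        self.prev_req = req


def run(rom):
    """Drive the sink from the source with an idealized, always-ready router."""
    source, sink = TrafficSource(rom), TrafficSink()
    for _ in range(len(rom) + 1):
        data, req = source.output()
        sink.sample(data, req)
        source.acknowledge()   # assumption: every ROM output is acknowledged
    return sink.captured


# Two data words, each held while the request bit goes high and then low.
example_rom = [(0xA1, 1), (0xA1, 0), (0xB2, 1), (0xB2, 0)]
print(run(example_rom))
```

In this model the request bit is stored in the ROM next to the data, so the source needs no logic beyond the counter, which mirrors the idea that the fixed memory drives the request signal directly.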



Figure 3. Block diagram for traffic generator for testing of packets transmitted between processor and peripherals

The ChipScope VIO cores can be built in two ways, depending on how the required information is gathered from the programming environment. In the conventional approach, the core is programmed into the automatically synthesized netlist. The alternative is the usual Verilog HDL technique. The Verilog HDL technique cannot be deployed with the proposed design, because a Verilog HDL description is fixed and allows no further dynamic modification. The first technique is therefore the better choice for gathering information for dynamic programming: the information is obtained from the synthesizer through the available handshake data path, and because the handshake signals are not part of the data traffic channel, they do not affect netlist synthesis.

As an outcome, the VIO cores are manually configured in Verilog HDL. In the proposed multi-processor with NoC router design, an ICON core is incorporated along with the VIO core. The ICON core manages the communication between the JTAG interface and the ChipScope software, which interacts with the planning tools to reduce design complexity. Because the data are captured on the falling edge of the request signal, the request signal is used as the clock of the VIO core and is routed on the corresponding clock net, as shown in Figure 4. The designed multi-processor with NoC router is subdivided into about nine Verilog entities, structured according to the four different directions, which are encoded in the header as part of the preamble. The area utilization for the various parameters is listed in Table 1; notably, 590 latches and 2,378 LUTs are used to meet a matching delay percentage of less than about 29%. To modify the data content of the ROM, the ROM can be filled from HDL data resources in two ways: with the Xilinx core generator or with HDL-based techniques. The ROM is programmed using the route lookup table available in the master NA.
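The idea of carrying the four routing directions in the packet header, with routes taken from a lookup table, can be sketched as follows. This is a hypothetical illustration only: the 2-bit direction codes, the `ROUTE_TABLE` entries, and the helper names `encode_route` and `next_hop` are assumptions made for the example and are not taken from the proposed design.

```python
# Assumed 2-bit codes for the four directions (not specified in the paper).
DIRECTIONS = {"N": 0b00, "E": 0b01, "S": 0b10, "W": 0b11}
CODE_TO_DIR = {code: name for name, code in DIRECTIONS.items()}

# Hypothetical route lookup table: destination name -> list of hops.
ROUTE_TABLE = {"mem": ["E", "E", "S"], "uart": ["N", "W"]}


def encode_route(hops):
    """Pack a list of hops into a header, first hop in the lowest two bits."""
    header = 0
    for i, hop in enumerate(hops):
        header |= DIRECTIONS[hop] << (2 * i)
    return header


def next_hop(header):
    """Return the direction for this router and the header for the next one."""
    return CODE_TO_DIR[header & 0b11], header >> 2


header = encode_route(ROUTE_TABLE["mem"])
for _ in ROUTE_TABLE["mem"]:
    direction, header = next_hop(header)
    print(direction)
```

Each router in this sketch consumes the lowest two bits and forwards the shifted header. Because the code for one direction is zero, the header alone does not mark the end of the route, so the hop count has to be known separately in this simplified scheme.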

[Figure: simulator screenshot of the behavioral simulation of the design, showing waveforms for the 20-bit output buses such as OUT2[19:0] and OUT4[19:0]]
                                                                              | B0000                           | 40 NI010 (D    | 900 abobt                              | P2010            | x            | 60000          |                |
| BULATION                       | S > # 0012018.0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              | 70000                           | 0 0 000        |                                        | 70000            |              | 1 1            | 0000           |
| Run Simulation                 | 5 > W OUT6[12:0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                              | 70000                           | 0 0410 790     |                                        |                  | 10000        |                | X              |
|                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              | 72345                           |                | 0 0 0 0 22340 0 223                    |                  | 200 00000    | 77682          | 77692          |
| TL ANALYSIS                    | S > ₩ 0UT8[19:0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                              | b0000                           | 1              | B (B) A1010                            | F1010            |              | 40000          |                |
| Open Elaborated Design         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              | P0000                           | 100000         | <b>B</b> ( <b>8</b> 6143 ) <b>8</b> 30 |                  | 72345        | e0000          |                |
|                                | > W OUTIQITAQ                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                              | 50000                           | 100000         |                                        | 43) 86543 ( 8400 |              | ( 0000 )       | 60000          |
| NTHESIS                        | > 9 00111[19:0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              | b8765<br>X0000                  |                | 200000                                 | (De 176)         | <u>1</u>     | 47036          |                |
| Run Synthesis                  | > 10 00/T12(124)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                              | 20000                           | · · · · ·      | X0000K                                 |                  | 5            | <10005         |                |
|                                | > 9 00100100                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                              | 20000                           |                | 20000                                  |                  |              | 1000           |                |
| Open Synthesized Design        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              |                                 |                |                                        |                  |              |                |                |
| PLEMENTATION                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              |                                 |                |                                        |                  |              |                |                |
| Run Implementation             | Tel Console Messager                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                              | Log                             | 2 (            |                                        |                  |              |                |                |

Figure 4. Simulated results of proposed multi-processor with 3\*3 NoC for high speed and low power environment

Table 1. Comparison between the proposed 3\*3 synchronous network on chip (multi-processor with NoC) and the existing 3\*3 synchronous network on chip (multi-processor with NoC)

| Parameter                | Existing [17] | Existing [22]             | Proposed                  |
|--------------------------|---------------|---------------------------|---------------------------|
| Slice registers          | 27,925        | 500/packet                | 2,306                     |
| Slice LUTs               | 24,980        | 5,150                     | 3,201                     |
| LUT flip-flop pairs used | 9,830         | 542                       | 542                       |
| Delay                    | 62 µs         | 49 µs                     | 2.91 ns                   |
| Power                    | 1,146 mW      | 1.15 W                    | 84 mW                     |
| Frequency                | 571 MHz       | 594.229 MHz               | 594.229 MHz               |
| Throughput               | 2.1 GB/s      | 235.20 MB/s               | 6.37 GB/s                 |
| Throughput               | 24.2 ns       | 2.91 × 0.71 ns = 2.006 ns | 2.91 × 0.71 ns = 2.006 ns |

The primary goal of the present implementation is to demonstrate the multi-processor with NoC operating in asynchronous mode on a customized FPGA. According to the literature review, existing FPGA implementations of asynchronous circuits rely on devices and tools oriented toward the synchronous mode; hence this work has been implemented with the usual general design flow, applied to asynchronous operation on the FPGA. The designed multi-processor with NoC operates asynchronously and offers best-effort service. It incorporates a pair of NAs operating in master-slave mode along with a router.

The router uses wormhole routing on a mesh topology. Deadlock in the multi-processor with NoC is resolved with two-dimensional XY routing in combination with supply routing. The NoC uses packet-based switching for data movement; because a packet may contain an indefinite number of data words, its start and end are identified by additional handshake signals encoded with the data packet. The NAs provide the OCP interface for connecting cores to the network, and synchronization is performed with a two-flip-flop synchronizer. The designed router uses 542 latches across its components, which contribute to the delay. A miniature multi-processor prototype was developed around the asynchronous NoC: it comprises three different CPUs with their corresponding peripheral units, which are associated with the 3×2 configuration.
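The two-dimensional XY routing mentioned above can be illustrated with a short software model. This is only a sketch under stated assumptions, not the authors' HDL: the coordinate scheme, the function names `xy_next_hop` and `xy_route`, and the default 3×2 mesh size (borrowed from the prototype dimensions in the text) are all hypothetical. Supply routing, wormhole flow control, and the handshake signals are not modeled. The sketch shows why XY routing avoids routing deadlock: a packet finishes all X moves before any Y move, so a turn from Y back to X never occurs and no cyclic channel dependency can form.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the mesh (assumed scheme)


def xy_next_hop(cur: Coord, dst: Coord) -> Coord:
    """Return the next router on the XY route from cur to dst.

    The packet moves along X until the column matches and only then
    along Y. Returns cur unchanged if it is already at dst.
    """
    x, y = cur
    dx, dy = dst
    if x != dx:
        return (x + (1 if dx > x else -1), y)
    if y != dy:
        return (x, y + (1 if dy > y else -1))
    return cur


def xy_route(src: Coord, dst: Coord, cols: int = 3, rows: int = 2) -> List[Coord]:
    """List every router visited from src to dst on a cols x rows mesh."""
    for x, y in (src, dst):
        if not (0 <= x < cols and 0 <= y < rows):
            raise ValueError(f"node {(x, y)} is outside the {cols}x{rows} mesh")
    path = [src]
    while path[-1] != dst:
        path.append(xy_next_hop(path[-1], dst))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

For example, a packet from (0, 0) to (2, 1) visits (0, 0), (1, 0), (2, 0), and (2, 1): both X hops are taken first, followed by the single Y hop.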

To make the designed multiprocessor resistant to message-dependent deadlocks, dedicated paths are employed for requests and replies. This approach may not meet stricter delay requirements on the FPGA, but it uses few logic resources, and it is expected to make meaningful use of the resources already available on the FPGA for deadlock avoidance. It remains unclear whether FPGA prototyping can resolve all the issues of an asynchronous multi-processor with NoC, including delay matching; design complexity and the lack of supporting tools also make successful operation difficult. To minimize delay in the FPGA circuits, the design is equipped with a separate data path for data exchange. Relatively low delay is achieved by scaling down the data packet path using macros built from design primitives. Delay matching is essential in this design, yet it is very difficult to preserve under the logic optimizations applied by the available design tools. The operation can nevertheless be optimized considerably through suitable settings of the design constraints. Explicit LUT mapping of the multi-processor with NoC circuits would be an alternative that addresses these issues.
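
The effect of dedicating separate paths to requests and replies can be illustrated with a minimal sketch. The class `NetworkInterface`, its `capacity` parameter, and the message payloads are hypothetical and do not come from the design described here; the sketch only shows that, with one bounded queue per message class, a backlog of requests cannot prevent a reply from being accepted.

```python
from collections import deque
from typing import Deque, Dict


class NetworkInterface:
    """Hypothetical interface with one bounded queue per message class."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Separate queues: a full request queue can never block a reply.
        self.queues: Dict[str, Deque[str]] = {"request": deque(), "reply": deque()}

    def accept(self, kind: str, payload: str) -> bool:
        """Enqueue a message; return False when the queue of that class is full."""
        queue = self.queues[kind]
        if len(queue) >= self.capacity:
            return False
        queue.append(payload)
        return True


if __name__ == "__main__":
    ni = NetworkInterface(capacity=2)
    print(ni.accept("request", "read A"))  # True
    print(ni.accept("request", "read B"))  # True
    print(ni.accept("request", "read C"))  # False: request queue is full
    print(ni.accept("reply", "data A"))    # True: replies use their own queue
```

With a single shared queue, the last reply would have been refused along with the third request, which is the situation that can lead to a message-dependent deadlock.
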
Channel routing algorithm: this algorithm is deployed to assess the ideal path, in terms of area, among the architecturally interconnected sub-modules, and the results show that it yields superior efficiency and stability under NoC network congestion. A few screenshots of the preferred algorithm are shown in Figure 5 in support of the proposed work.

[Simulation waveform, image not reproduced: signals v74[47:0], v75[47:0], In[47:0], and enable0 to enable11, plotted against a time axis in ns.]

Figure 5. Simulation of the result after data routing onto the channel
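
The text does not specify how the channel routing algorithm scores candidate paths. As a hedged illustration only, the sketch below uses Dijkstra's shortest-path algorithm over a graph whose link costs are assumed to reflect area or congestion. The function `least_cost_path`, the node names, and the cost values are all hypothetical and are not taken from the proposed design.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbour: link cost}


def least_cost_path(graph: Graph, src: str, dst: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm: return (total cost, path) from src to dst."""
    best = {src: 0.0}
    previous: Dict[str, str] = {}
    frontier = [(0.0, src)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(frontier, (new_cost, neighbour))
    if dst not in best:
        raise ValueError(f"no path from {src} to {dst}")
    # Walk the predecessor links back from the destination.
    path = [dst]
    while path[-1] != src:
        path.append(previous[path[-1]])
    return best[dst], path[::-1]


if __name__ == "__main__":
    # Hypothetical sub-modules A..D; a higher cost stands for a more congested link.
    mesh = {
        "A": {"B": 1.0, "C": 4.0},
        "B": {"C": 1.0, "D": 5.0},
        "C": {"D": 1.0},
        "D": {},
    }
    print(least_cost_path(mesh, "A", "D"))  # (3.0, ['A', 'B', 'C', 'D'])
```

Raising the cost of a congested link steers the selected path around it, which is one simple way to relate path selection to congestion.
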


# 6. CONCLUSION

The machine learning-based SVM is state-of-the-art. It has been evaluated on different packets sent from the multi-processor to different peripherals, and it classifies the type of packet received with high accuracy and low error, with a trade-off in computational complexity. A multi-processor and accelerator that perform machine learning-based SVM classification were designed and developed through high-level synthesis (HLS) in the Vivado Design Suite 2018.1 tool, and their performance in terms of latency, throughput, and power consumption was evaluated and compared with the state of the art. The obtained results show that the HLS-based implementation is 23.4x faster than a GPP-based design and 12.6x faster than a GPU. The proposed multiprocessor, along with different high-speed protocols, is integrated with block-level hardware containing a Zynq processor and AXI interconnects, targeted to the Zynq-7000 FPGA development board, and finally interfaced with the software development kit (SDK) environment; through software programming, the entire design and its functionality are tested and verified. The results attained from the proposed design show that the generated custom loop accelerator could be used to compute complex machine learning classifiers with an increasingly large amount of data. The proposed design can also scale the architecture to other PEs that are part of a 3D-NoC. The suggested hybrid multi-processor with NoC, based on an application-aware design for big data loads and transmission, is extremely useful in increasing service quality and energy efficiency under significantly heterogeneous application loads. To assist in the selection of an efficient multi-processor with NoC, this research work presents a hybrid multi-processor with NoC, consisting of a specific multi-processor with NoC and a buffered multi-processor with NoC, together with an application-aware technique.
The proposed methodology enhances the performance of the system significantly. Furthermore, we developed a unique congestion optimization approach for the hybrid multi-processor with NoC. This approach works by redistributing the packets in congested nodes and verifying the congestion performance of various multi-processors with NoC. The overall system's energy efficiency may be significantly enhanced by employing these two proposed strategies. A unique bufferless routing algorithm that can be used with any topology is employed in this work. The recommended routing technique depends mainly on the principle of making-a-stop (MaS) to achieve deadlock and livelock freedom in a wormhole-switched multi-processor with NoC. The performance under synthetic traffic is assessed with a flit-level, cycle-accurate network simulator. When compared to another conventional bufferless routing algorithm, the computational results demonstrate that the designed routing algorithm improves average latency by 24%, power consumption by 19%, and area overhead by 47%.

# REFERENCES

- [1] T. Moscibroda and O. Mutlu, "A case for bufferless routing in on-chip networks," ACM SIGARCH Computer Architecture News, vol. 37, no. 3, pp. 196–207, 2009, doi: 10.1145/1555815.1555781.
- [2] J. Fang, S. Liu, S. Liu, Y. Cheng, and L. Yu, "Hybrid network-on-chip: an application-aware framework for big data," *Complexity*, vol. 2018, pp. 1–11, Jul. 2018, doi: 10.1155/2018/1040869.
- [3] Y. Zhang et al., "EGraph: efficient concurrent GPU-based dynamic graph processing," IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 6, pp. 5823–5836, 2023, doi: 10.1109/TKDE.2022.3171588.
- [4] A. Yoosefi and H. R. Naji, "A clustering algorithm for communication-aware scheduling of task graphs on multi-core reconfigurable systems," *IEEE Transactions on Parallel and Distributed Systems*, vol. 28, no. 10, pp. 2718–2732, Oct. 2017, doi: 10.1109/TPDS.2017.2703123.
- [5] L. Jing, L. Xiaola, and T. Liang, "Making-a-stop: a new bufferless routing algorithm for on-chip network," Journal of Parallel and Distributed Computing, vol. 72, no. 4, pp. 515–524, 2012.
- [6] R. G. Kunthara, R. K. James, S. Z. Sleeba, and J. Jose, "ReDC: reduced deflection CHIPPER router for bufferless NoCs," Proceedings of the 2018 8th International Symposium on Embedded Computing and System Design, ISED 2018, pp. 204–209, 2018, doi: 10.1109/ISED.2018.8704012.
- [7] J. Rahul and M. Sora, "Premature ventricular contractions classification using machine learning approach," in Proceedings - International Conference on Smart Electronics and Communication, ICOSEC 2020, Sep. 2020, pp. 367–370, doi: 10.1109/ICOSEC49089.2020.9215290.
- [8] A. Shpiner, E. Kantor, P. Li, I. Cidon, and I. Keslassy, "On the capacity of bufferless networks-on-chip," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 2, pp. 492–506, 2015, doi: 10.1109/TPDS.2014.2310226.
- [9] Kao, "Design of a bufferless photonic clos network-on-chip architecture," Digital Object Indentifier, 2012.
- [10] C. Feng, Z. Lu, A. Jantsch, M. Zhang, and Z. Xing, "Addressing transient and permanent faults in NoC with efficient fault-tolerant deflection router," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 21, no. 6, pp. 1053–1066, 2013, doi: 10.1109/TVLSI.2012.2204909.
- [11] D. Tomaso, A. K. Kodi, A. Louri, and R. Bunescu, "Resilient and power-efficient multi-function channel buffers in network-on-chip architectures," *IEEE Transactions on Computers*, vol. 64, no. 12, pp. 3555–3568, 2015, doi: 10.1109/TC.2015.2401013.
- [12] A. T. Tran and B. M. Baas, "Achieving high-performance on-chip networks with shared-buffer routers," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 22, no. 6, pp. 1391–1403, Jun. 2014, doi: 10.1109/TVLSI.2013.2268548.

- [13] C. H. Chao, K. C. Chen, and A. Y. Wu, "Routing-based traffic migration and buffer allocation schemes for 3-D network-on-chip systems with thermal limit," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 21, no. 11, pp. 2118–2131, 2013, doi: 10.1109/TVLSI.2012.2227852.
- [14] M. Arjomand and H. Sarbazi-Azad, "Power-performance analysis of networks-on-chip with arbitrary buffer allocation schemes," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 29, no. 10, pp. 1558–1571, Oct. 2010, doi: 10.1109/TCAD.2010.2061171.
- [15] C. Li, D. Dong, Z. Lu, and X. Liao, "RoB-router: a reorder buffer enabled low latency network-on-chip router," *IEEE Transactions on Parallel and Distributed Systems*, vol. 29, no. 9, pp. 2090–2104, Sep. 2018, doi: 10.1109/TPDS.2018.2817552.
- [16] H. M. Hussain, K. Benkrid, and H. Seker, "Dynamic partial reconfiguration implementation of the SVM/KNN multi-classifier on FPGA for bioinformatics application," *Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS*, vol. 2015-November, pp. 7667–7670, 2015, doi: 10.1109/EMBC.2015.7320168.
- [17] H. Irmak, F. Corradi, P. Detterer, N. Alachiotis, and D. Ziener, "A dynamic reconfigurable architecture for hybrid spiking and convolutional fpga-based neural network designs," *Journal of Low Power Electronics and Applications*, vol. 11, no. 3, 2021, doi: 10.3390/jlpea11030032.
- [18] K. Vipin and S. A. Fahmy, "FPGA dynamic and partial reconfiguration: a survey of architectures, methods, and applications," ACM Computing Surveys, vol. 51, no. 4, 2018, doi: 10.1145/3193827.
- [19] P. Bachanna and B. Gadgay, "Design and implementation of FPGA based dynamically re-configurable processor for machine learning systems," 2021 IEEE Mysore Sub Section International Conference, MysuruCon 2021, pp. 137–141, 2021, doi: 10.1109/MysuruCon52639.2021.9641553.
- [20] K. M. V. Gowda, S. Madhavan, S. Rinaldi, P. B. Divakarachari, and A. Atmakur, "FPGA-based reconfigurable convolutional neural network accelerator using sparse and convolutional optimization," *Electronics (Switzerland)*, vol. 11, no. 10, p. 1653, May 2022, doi: 10.3390/electronics11101653.
- [21] A. Shawahna, S. M. Sait, and A. El-Maleh, "FPGA-based accelerators of deep learning networks for learning and classification: a review," IEEE Access, vol. 7, pp. 7823–7859, 2019, doi: 10.1109/ACCESS.2018.2890150.
- [22] Z. Fan, J. Yang, J. Han, X. Zeng, and X. Cheng, "A mesh-based self-adaptive NoC with low-latency reconfigurable ring clusters," in 2020 IEEE 15th International Conference on Solid-State & Integrated Circuit Technology (ICSICT), Nov. 2020, pp. 1–3, doi: 10.1109/ICSICT49897.2020.9278309.
- [23] N. Ramalingam and A. Thiagarajan, "FPGA-based fault analysis for 7-level switched ladder multi-level inverter using decision tree algorithm," *International Journal of Reconfigurable and Embedded Systems*, vol. 12, no. 2, pp. 157–164, 2023, doi: 10.11591/ijres.v12.i2.pp157-164.
- [24] M. H. Ishak, M. S. Mispan, W. Y. Chiew, M. R. Kamaruddin, and M. A. Korobkov, "Secure lightweight obfuscated delay-based physical unclonable function design on FPGA," *Bulletin of Electrical Engineering and Informatics (BEEI)*, vol. 11, no. 2, pp. 1075–1083, 2022, doi: 10.11591/eei.v11i2.3265.
- [25] K. Malhotra and A. P. Singh, "Implementation of decision tree algorithm on FPGA devices," IAES International Journal of Artificial Intelligence (IJ-AI), vol. 10, no. 1, p. 131, Mar. 2021, doi: 10.11591/ijai.v10.i1.pp131-138.
- [26] C. Trabelsi, S. Meftali, and J.-L. Dekeyser, "Decentralized control for dynamically reconfigurable FPGA systems," *Microprocessors and Microsystems*, vol. 37, no. 8, pp. 871–884, Nov. 2013, doi: 10.1016/j.micpro.2013.04.012.

#### **BIOGRAPHIES OF AUTHORS**



**Dr. Prashant Bachanna** received a B.E. degree in Electronics and Communication Engineering from Visvesvaraya Technological University, Belagavi, Karnataka, India in 2008. He received his M.Tech. degree in VLSI System Design from Visvesvaraya Technological University, Belagavi, Karnataka, India in 2012. He completed his Ph.D. degree at Visvesvaraya Technological University, Belagavi, Karnataka. He is currently working as an Assistant Professor in the Department of Electronics and Communication Engineering at the Institute of Aeronautical Engineering (IARE), Hyderabad, Telangana. His areas of interest are VLSI design and embedded systems. He can be contacted at email: b.prashant@iare.ac.in.



**Dr. Palla Hari Sankar** received his B.Tech. in Mechanical Engineering from Jawaharlal Nehru Technological University, Hyderabad, Andhra Pradesh and his M.E. in Machine Design from University of Roorkee, Roorkee, Uttar Pradesh. He completed his Ph.D. degree at Jawaharlal Nehru Technological University Anantapur, Anantapur, Andhra Pradesh in 2017. He has published more than 20 papers in international journals and conferences. He has 23 years of teaching experience. His research interests include PMC, tribology, machine design, and machine learning. He is a life member of the Indian Society for Technical Education. He can be contacted at email: pallahharisankar1@gmail.com.



**Dr. Mukesh Kumar Tripathi** received a Ph.D. in computer science and engineering from Visvesvaraya Technological University (VTU), Belagavi. He also received a B.E. in information technology from Guru Nanak Dev Engineering College, Bidar, India and an M.Tech. in computer science and engineering from SCET, JNTUH University in 2013. He has 13 years of teaching and administrative experience. He has supervised and co-supervised more than five masters and 20 B.E. students. He has authored or co-authored more than ten publications, with over 232 citations. He is working as an assistant professor with the Department of Computer Science and Engineering, Vardhaman College of Engineering, Hyderabad, India. His research interests include soft computing, machine learning, intelligent systems, image processing, and hyperspectral imaging. He can be contacted at email: mukeshtripathi016@gmail.com.



**Dr. Shivendra** received the Ph.D. degree in Electronics and Communication Engineering from JJT University, Rajasthan, India, in 2023. He received his M.Tech. degree in Electronics and Communication from Thiruvalluvar University, Vellore, Tamil Nadu, India, in 2015. He received his B.E. degree in Electronics and Communication from GNDEC, Bidar, India in 2011. He is working as an Assistant Professor in the Department of BCA at D.K College, Dumraon, Buxar. He has supervised and co-supervised more than 20 students. He has authored or co-authored more than 5 publications. His research interests include soft computing, machine learning, intelligent systems, image processing, hyperspectral imaging, and data science. He can be contacted at email: srivastavashivendra29@gmail.com.



**Dr. Kadali Ravi Kumar** received his B.Tech. in Electrical and Electronics Engineering and M.Tech. in Power Electronics from JNTU, Kukatpally, Hyderabad in 1998 and 2005 respectively. He completed his Ph.D. degree at the National Institute of Technology, Warangal, India. He has published more than 15 papers in international and national journals and conferences. He has completed two AICTE-funded workshops worth 10 lakhs. He is working as a professor at Vasavi College of Engineering, Hyderabad. He has 25 years of teaching experience and 16 years of R&D experience. His research interests include artificial intelligence techniques applied to power system optimization, renewable energy systems, and power quality issues. He can be contacted at email: drkadaliravikumar@gmail.com.



**Dr. Nilesh Bhosle** received his B.E. in Electronics Engineering and M.Tech. in Electronics Engineering from Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra. He completed his Ph.D. degree at Swami Ramanand Teerth Marathwada University, Nanded. He has published more than 15 papers in international and national journals and conferences and has published 10 patents. He is currently working as an Associate Professor in the Department of Information Technology at Trinity Academy of Engineering, Pune. He has 16 years of teaching experience and 2 years of industry experience. His research interests include machine learning, VLSI design, image processing, content-based image retrieval, and computer vision. He can be contacted at email: bhoslenp@gmail.com.