Edge AI Technology

How to Run AI on a Microcontroller or Microprocessor – A Beginner’s Guide to Edge AI

In the last few years, Edge AI has rapidly transformed how devices understand and interact with the world — without needing constant cloud access. From detecting voice commands on smartwatches to classifying images on drones, AI at the edge is revolutionizing embedded systems.

But how exactly do you run AI on a microcontroller (MCU) or microprocessor (MPU)? As a beginner, this can seem overwhelming. Don’t worry — this blog breaks it down into simple, digestible steps.

What is Edge AI?

Edge AI means running artificial intelligence algorithms locally on a device like an MCU or MPU, rather than sending data to a remote cloud server. This enables:

  • Faster responses (no internet delay)
  • Lower power consumption
  • Increased privacy (data stays on device)

Common applications include:

  • Wake-word detection (“Hey Alexa”)
  • Image classification (e.g., person vs object)
  • Anomaly detection in machines
  • Gesture recognition, etc.

Why Edge AI Matters

Imagine a construction site. Dust, noise, humans in helmets — and a camera watching for safety violations.
If that video had to go to the cloud for AI processing, it would take time, cost money, and raise privacy concerns.

So instead, we ask:

“Can we do the thinking right there, on the device?”
That’s Edge AI.

The i.MX93, with its efficient NPU, was made for exactly this.

Microcontroller vs Microprocessor: Which Brain to Choose?

Before you start building, you need to pick the right kind of hardware. Here’s how I learned the difference:

I started with microcontrollers (MCUs) — they’re simple, low-power, and perfect for turning on LEDs or reading sensor data. But when I tried to run an image classifier? Not a chance.

That’s when I moved to microprocessors (MPUs) like the i.MX93, which brought the power of Linux, better memory, and even a dedicated Neural Processing Unit (NPU).

Feature               | MCU (e.g., Cortex-M)    | MPU (e.g., Cortex-A / i.MX93)
Processing Power      | Low (MHz range)         | High (GHz range)
Memory                | Limited (KBs)           | Larger (MBs or GBs)
OS Support            | Bare metal / RTOS       | Full Linux support
AI Support            | TFLite Micro            | TFLite, ONNX, NPU via eIQ SDK
Use Case              | Low-power IoT, sensors  | Edge AI, vision, industrial control

How to Deploy a Machine Learning Model on Any MPU (Generalized Method)

Deploying a machine learning (ML) model on a Microprocessor Unit (MPU) — such as Raspberry Pi, NXP i.MX93, or NVIDIA Jetson — typically involves four key stages:


 1. Train the Model

  • Use standard ML frameworks like:

    • TensorFlow / Keras

    • PyTorch

    • Scikit-learn (for simpler models)

  • Train on a PC or cloud platform (e.g., Colab, AWS, Azure)

  • Export the model in a format such as:

    • .h5, .pb (TensorFlow)

    • .pt, .onnx (PyTorch)

    • .tflite (Lite model for edge)

This step is compute-heavy and is always done on a PC or in the cloud, never on the MPU itself. A minimal training sketch is shown below.
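
To make this concrete, here is a minimal, hedged training sketch in TensorFlow/Keras. The architecture, the 96x96 input size, the random placeholder data, and the file name helmet_classifier.h5 are all illustrative assumptions, not a prescribed setup.

    # Minimal training sketch (runs on a PC or in the cloud, not on the MPU).
    # The architecture, input size, random data, and file name are placeholders.
    import numpy as np
    import tensorflow as tf

    # Stand-in data: replace with your own labeled images (e.g., helmet / no helmet)
    train_images = np.random.rand(100, 96, 96, 3).astype("float32")
    train_labels = np.random.randint(0, 2, size=100)

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(96, 96, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_images, train_labels, epochs=5, validation_split=0.2)

    # Export in a format the TFLite converter understands
    model.save("helmet_classifier.h5")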

 2. Optimize & Convert the Model

  • MPUs are resource-constrained (limited RAM/compute), so you must optimize the model (a quantization sketch follows this list):

    • Convert to TensorFlow Lite, ONNX, or custom quantized formats

    • Reduce model size via:

      • Quantization (e.g., FP32 → INT8)

      • Pruning (removing unnecessary weights)

      • Knowledge distillation (if needed)
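
As a sketch of post-training INT8 quantization, assuming the Keras model saved in the previous step, the snippet below uses random arrays as a placeholder calibration set; in practice you would feed real preprocessed training samples.

    # Post-training INT8 quantization sketch.
    # "helmet_classifier.h5" and the random calibration data are placeholders.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.load_model("helmet_classifier.h5")
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    # The representative dataset lets the converter calibrate INT8 ranges;
    # replace the random arrays with real preprocessed samples.
    def representative_data_gen():
        for _ in range(100):
            yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

    converter.representative_dataset = representative_data_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model = converter.convert()
    with open("helmet_classifier_int8.tflite", "wb") as f:
        f.write(tflite_model)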

 3. Deploy the Model to the MPU

  • Copy the model to the MPU using:

    • SD card

    • USB

    • SSH / SCP

  • Install necessary runtime libraries (e.g., TensorFlow Lite runtime, ONNX Runtime, OpenCV)

  • Configure OS packages or GPIO (e.g., access camera, relays, buzzer)


 4. Run Inference on the MPU

  • Write an inference script (Python, C++, etc.); a minimal Python sketch follows this list

    • Load the model

    • Capture input (e.g., camera, sensor)

    • Preprocess the data

    • Run inference

    • Postprocess and act on output (e.g., turn on buzzer, display alert)
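
Here is a minimal, hedged Python sketch of that flow using the TensorFlow Lite runtime. The model path, input shape, and the random stand-in frame are placeholders; camera capture and GPIO handling are only indicated in comments.

    # Generic inference sketch for a Linux MPU; model path and shapes are placeholders.
    import numpy as np
    import tflite_runtime.interpreter as tflite  # on a PC: from tensorflow import lite as tflite

    interpreter = tflite.Interpreter(model_path="helmet_classifier_int8.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Capture + preprocess: a random frame stands in for a real camera image here
    frame = np.random.randint(0, 255, size=tuple(inp["shape"]), dtype=np.uint8).astype(inp["dtype"])

    interpreter.set_tensor(inp["index"], frame)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    print("Predicted class:", int(np.argmax(scores)))
    # Postprocess and act: e.g., drive a buzzer GPIO when a violation class is detected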

Congratulations: your model is now running offline, at the edge, with real-time inference.

How to Deploy an ML Model on the NXP i.MX93 MPU

The i.MX93 is an edge AI-enabled MPU from NXP with:

  • Arm Cortex-A55 (Dual-core)

    • Runs Linux (Yocto) OS

    • Handles AI workloads, networking, file systems

  • Arm Ethos-U65 NPU

    • Specialized Neural Processing Unit

    • Accelerates INT8 AI inference at the edge

  • Arm Cortex-M33 MCU

    • Dedicated real-time controller

    • Runs bare-metal or RTOS (FreeRTOS, Zephyr)

    • Handles deterministic tasks like sensor sampling, GPIO control, motor control, or low-latency signal processing

  • Supports TensorFlow Lite, ONNX, and eIQ Toolkit

  • Runs Yocto Linux (custom embedded OS)


There are two primary deployment methods:


Method 1: Using TensorFlow → TensorFlow Lite

This method relies on standard open-source tools and is not tied to NXP’s tooling.

 Steps:

  1. Train the model on a PC (e.g., a helmet detector):

    • Use TensorFlow/Keras

    • Save model in .h5 or .pb format

  2. Convert to TensorFlow Lite:

    • Use TFLiteConverter to convert to .tflite

    • Apply post-training quantization to reduce size:

      # 'model' is the Keras model trained in step 1
      converter = tf.lite.TFLiteConverter.from_keras_model(model)
      converter.optimizations = [tf.lite.Optimize.DEFAULT]
      tflite_model = converter.convert()
  3. Transfer to i.MX93:

    • Via SD card or SCP

    • Place .tflite model in a known directory

  4. Write an inference script on i.MX93 (a tflite-runtime sketch follows this list):

    • Use tflite-runtime or TensorFlow Lite Interpreter

    • Process camera frames

    • Run inference

    • Control GPIOs (like buzzers) via Linux

  5. Install Dependencies:

    • Manually install:

      • tflite-runtime

      • opencv-python

      • numpy
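
To tie these steps together, here is a hedged sketch of such a script using tflite-runtime and OpenCV. The camera index, model path, 96x96 input size, and class meaning are assumptions, and GPIO control is only indicated as a comment because the exact interface (sysfs vs. libgpiod) depends on your Yocto image.

    # Inference sketch for the i.MX93 CPU path (Method 1).
    # Camera index, model path, input size, and class meaning are assumptions.
    import cv2
    import numpy as np
    import tflite_runtime.interpreter as tflite

    interpreter = tflite.Interpreter(model_path="/home/root/helmet_classifier_int8.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    cap = cv2.VideoCapture(0)  # USB camera; a MIPI camera may need a GStreamer pipeline string
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        resized = cv2.resize(frame, (96, 96))
        tensor = np.expand_dims(resized, axis=0).astype(inp["dtype"])
        interpreter.set_tensor(inp["index"], tensor)
        interpreter.invoke()
        scores = interpreter.get_tensor(out["index"])[0]
        if int(np.argmax(scores)) == 1:   # class 1 = "no helmet" in this sketch
            print("Violation detected")   # here you would toggle a GPIO (buzzer/LED)
    cap.release()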

Pros and Cons

Pros                     | Cons
Open-source & flexible   | Requires manual setup
Large community support  | Performance may be lower
Vendor-agnostic          | Less optimized for NXP hardware

 Method 2: Using NXP’s eIQ Toolkit (Recommended for i.MX)

This is NXP’s official AI deployment framework for i.MX devices. It simplifies optimization and deployment by leveraging hardware-aware tools.

 Steps:

  1. Train your model:

    • Train the model using NXP’s eIQ Toolkit, which provides:

      • GUI- and CLI-based model training and management

      • Support for importing custom datasets

      • Built-in training pipelines using TensorFlow backend

    • You can also use:

      • Pre-trained models

      • Import models in formats like .tflite, .onnx, or .pb directly into eIQ

  2. Import into eIQ Toolkit (GUI or CLI):

    Once training is complete, optimize the model specifically for the Ethos-U65 NPU in the i.MX93:

    • Apply INT8 quantization (reduces model size and speeds up inference)

    • Perform compiler optimizations for the NPU

    • Run model validation within eIQ to verify accuracy and compatibility

  3. Deploy to i.MX93:

    Once your model is optimized using eIQ Toolkit, you have two deployment options:


    🔹 Option A: Direct Deployment via eIQ Toolkit

    • Use NXP’s Yocto Linux image with eIQ SDK pre-installed

    • Transfer the optimized .eiq model file to the i.MX93 using:

      • SCP (Secure Copy Protocol over SSH), or

      • USB connection / Ethernet, depending on setup

    • This is a faster and more integrated method, requiring no intermediate storage like SD cards


    🔹 Option B: TFLite Export & SD Card Transfer

    • Export your trained model in TensorFlow Lite (.tflite) format

    • Store the .tflite file on an SD card

    • Insert the SD card into the i.MX93’s SD card reader

    • From there, load and run the model with the TensorFlow Lite interpreter, or convert it further on the device

  4. Run Inference with eIQ Runtime:

    Use the eIQ runtime APIs (C++ or Python) on the i.MX93 to:

    • Capture real-time images from a USB or MIPI camera

    • Run inference on the Ethos-U65 NPU

    • Trigger GPIO outputs (buzzer, LEDs, etc.) using built-in libraries

    A hedged Python sketch of NPU inference is shown below, after the deployment notes.

This gives flexibility based on development preference:

  • Use Option A for seamless integration with the eIQ flow

  • Use Option B if you’re working in a TensorFlow-first environment or prototyping offline
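
The exact eIQ runtime API surface varies between BSP releases, so the following is only a hedged sketch of one common route: NXP’s TensorFlow Lite build can offload a Vela-compiled model to the Ethos-U65 through an external delegate. The delegate path /usr/lib/libethosu_delegate.so and the model file name are assumptions; check the eIQ user guide for your Yocto image.

    # Hedged sketch: offloading a Vela-compiled .tflite model to the Ethos-U65 NPU
    # through TFLite's external-delegate mechanism. The delegate path below is an
    # assumption based on typical NXP Yocto images; verify it on your board.
    import numpy as np
    import tflite_runtime.interpreter as tflite

    delegate = tflite.load_delegate("/usr/lib/libethosu_delegate.so")   # assumed path
    interpreter = tflite.Interpreter(
        model_path="helmet_classifier_int8_vela.tflite",                # assumed file name
        experimental_delegates=[delegate],
    )
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    dummy = np.zeros(tuple(inp["shape"]), dtype=inp["dtype"])  # stand-in for a camera frame
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()
    print(interpreter.get_tensor(out["index"]))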

Pros and Cons

Pros                         | Cons
Highly optimized for i.MX93  | Limited to NXP boards
Easy deployment pipeline     | May lock you into the eIQ ecosystem
GUI tools for optimization   | Smaller community than TensorFlow

Summary Table: TF Lite vs. eIQ Toolkit on i.MX93

Feature               | TensorFlow → TFLite            | NXP eIQ Toolkit
Model Format          | .tflite                        | .eiq
Optimization          | Manual (quantization/pruning)  | Automatic, hardware-aware
GPIO & Camera         | Manual via Linux APIs          | Integrated in eIQ runtime
Hardware Acceleration | Limited (CPU)                  | Uses Ethos-U65 NPU
Difficulty            | Medium to High                 | Beginner-friendly
Best For              | Flexibility, custom code       | Fast deployment on NXP MPUs

If you’re deploying specifically on the NXP i.MX93, the eIQ Toolkit is the faster, more efficient, and more production-ready option, especially when targeting real-time inference on low-power hardware.

Author

Mridul Bajaj
