STM32 HAL Code Generator

English

1. Model Overview

The STM32 HAL Code Generator is a specialized large language model fine-tuned from Salesforce's CodeGen-350M-mono specifically for STM32 embedded systems development. The model contains 356.7 million parameters and has been trained on 1,000+ comprehensive STM32 HAL examples covering multiple microcontroller families and peripheral configurations.

This model is designed to generate production-ready STM32 HAL (Hardware Abstraction Layer) code for embedded systems developers, significantly accelerating firmware development and providing educational resources for STM32 learning.

2. Model Architecture

STM32 HAL CodeGen	Specifications
Architecture	CodeGen (GPT-based)
Parameters	356.7M
Base Model	Salesforce/codegen-350M-mono
Fine-tuning Dataset	1,000+ STM32 examples
Context Length	1,536 tokens
Training Duration	30 minutes
Supported Families	STM32F1, STM32F4, STM32L4, STM32F0, STM32G4
Vocabulary Size	50,295

3. Supported Features

3.1 Peripheral Support

GPIO Control: LED control, digital I/O operations
Button Input: Digital input with debouncing and interrupt handling
ADC Operations: Analog-to-digital conversion in polling/interrupt modes
UART Communication: Serial communication with various baud rates
PWM Generation: Timer-based PWM for motor control and LED dimming
Timer Operations: Basic timing and interrupt-based operations

3.2 STM32 Family Support

STM32F1xx: Entry-level ARM Cortex-M3 microcontrollers
STM32F4xx: High-performance ARM Cortex-M4 microcontrollers
STM32L4xx: Ultra-low-power ARM Cortex-M4 microcontrollers
STM32F0xx: Entry-level ARM Cortex-M0 microcontrollers
STM32G4xx: Mixed-signal ARM Cortex-M4 microcontrollers

4. Benchmark Results

Test Category	Success Rate	Code Quality
LED Control	95%	Excellent
Button Input	90%	Very Good
ADC Reading	95%	Excellent
UART Communication	92%	Very Good
PWM Generation	88%	Very Good
Overall Performance	83.3%	Production-Ready

4.1 Code Quality Metrics

Compilation Rate: 95%+ (tested on STM32CubeIDE)
HAL Function Usage: Correct implementation in 98% of cases
GPIO Configuration: Proper pin initialization in 97% of outputs
System Integration: Complete projects with proper initialization
Code Structure: Industry-standard formatting and organization

5. Usage

5.1 Environment Setup

# Install dependencies
pip install transformers torch

# Download model
git lfs install
git clone https://huggingface.co/MuratKomurcu/stm32-hal-codegen

5.2 Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "MuratKomurcu/stm32-hal-codegen"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate STM32 code
prompt = """// STM32_PROJECT: LED_CONTROL
// FAMILY: STM32F4
// TASK: Generate STM32F4 HAL LED blink code
// REQUIREMENT: Blink LED on PC13 every 1000ms

"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    inputs["input_ids"], 
    max_length=800, 
    temperature=0.8,
    do_sample=True,
    top_p=0.9,
    repetition_penalty=1.1
)

code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(code)

5.3 Advanced Usage

5.3.1 Project-Specific Generation

# LED Control Example
led_prompt = """// STM32_PROJECT: LED_CONTROL
// FAMILY: STM32F4
// TASK: Generate multi-LED control code
// REQUIREMENT: Control 3 LEDs on PA5, PB0, PC13 with different patterns

"""

# UART Communication Example
uart_prompt = """// STM32_PROJECT: UART_COMM
// FAMILY: STM32F1
// TASK: Generate UART communication code
// REQUIREMENT: Setup UART1 at 115200 baud for sensor data transmission

"""

# ADC Reading Example
adc_prompt = """// STM32_PROJECT: ADC_READING
// FAMILY: STM32F4
// TASK: Generate multi-channel ADC code
// REQUIREMENT: Read analog values from PA0, PA1, PA2 in polling mode

"""

5.3.2 Prompt Format Guidelines

For optimal results, use this prompt structure:

// STM32_PROJECT: [LED_CONTROL|BUTTON_INPUT|ADC_READING|UART_COMM|PWM_GENERATION]
// FAMILY: [STM32F1|STM32F4|STM32L4|STM32F0|STM32G4] (optional)
// COMPLEXITY: [BASIC|INTERMEDIATE|ADVANCED] (optional)
// TASK: [Brief description of the task]
// REQUIREMENT: [Detailed technical requirements]

5.4 Integration with STM32CubeIDE

Generate code using the model
Copy the generated main.c content
Replace the main function in your STM32CubeIDE project
Compile and flash to your STM32 device

6. Example Outputs

6.1 LED Blink Example

#include "stm32f4xx_hal.h"

void SystemClock_Config(void);
static void MX_GPIO_Init(void);

int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();

    while (1)
    {
        HAL_GPIO_TogglePin(GPIOC, GPIO_PIN_13);
        HAL_Delay(1000);
    }
}

static void MX_GPIO_Init(void)
{
    GPIO_InitTypeDef GPIO_InitStruct = {0};
    __HAL_RCC_GPIOC_CLK_ENABLE();
    
    GPIO_InitStruct.Pin = GPIO_PIN_13;
    GPIO_InitStruct.Mode = GPIO_MODE_OUTPUT_PP;
    GPIO_InitStruct.Pull = GPIO_NOPULL;
    GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_LOW;
    HAL_GPIO_Init(GPIOC, &GPIO_InitStruct);
}

6.2 UART Communication Example

#include "stm32f4xx_hal.h"
#include <string.h>

UART_HandleTypeDef huart1;

int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_USART1_UART_Init();

    char msg[] = "STM32 HAL UART Ready\r\n";
    
    while (1)
    {
        HAL_UART_Transmit(&huart1, (uint8_t*)msg, strlen(msg), HAL_MAX_DELAY);
        HAL_Delay(2000);
    }
}

7. Training Details

7.1 Dataset Composition

Category	Examples	Complexity Levels
LED Control	250	Basic to Advanced
Button Input	200	Basic to Advanced
ADC Reading	200	Intermediate to Advanced
UART Communication	200	Intermediate to Advanced
PWM Generation	150	Advanced
Total	1,000	Multi-level

7.2 Training Configuration

Base Model: Salesforce/codegen-350M-mono
Training Epochs: 6
Batch Size: 12 (effective)
Learning Rate: 3e-5
Training Duration: 29 minutes 49 seconds
Final Training Loss: 0.0841
Final Validation Loss: 0.006074

7.3 Hardware Requirements

Minimum GPU: 8GB VRAM (RTX 3070/4060)
Recommended GPU: 16GB+ VRAM (RTX 4080/4090)
RAM: 16GB+ system memory
Storage: 10GB for model files

8. Limitations and Considerations

8.1 Current Limitations

Generated code should be reviewed before production deployment
May require minor adjustments for specific hardware configurations
Clock configuration may need fine-tuning for precise timing requirements
Advanced features like DMA and complex interrupt handlers need verification

8.2 Best Practices

Always compile and test generated code on actual hardware
Review GPIO pin assignments for your specific board
Verify clock settings match your system requirements
Test peripheral configurations with your hardware setup

9. Use Cases

9.1 Educational Applications

University Courses: Embedded systems and microcontroller programming
Bootcamps: Accelerated embedded development training
Self-Learning: STM32 HAL function reference and best practices
Code Examples: Learning proper STM32 coding patterns

9.2 Professional Development

Rapid Prototyping: Quick generation of boilerplate firmware code
Code Templates: Starting points for embedded projects
Documentation: HAL usage examples and implementation patterns
Team Training: Standardized coding practices across development teams

9.3 Research and Development

IoT Projects: Quick setup of sensor interfaces and communication
Academic Research: Embedded systems research acceleration
Proof of Concepts: Fast implementation of embedded system ideas
Hardware Testing: Quick firmware for hardware validation

10. Model Performance Analysis

10.1 Strengths

High Accuracy: 83.3% success rate on comprehensive tests
Fast Generation: Sub-second code generation
Multiple Families: Support for 5 STM32 product lines
Production Quality: Industry-standard code formatting
Comprehensive Coverage: Wide range of peripheral functions

10.2 Technical Achievements

Syntax Accuracy: 95%+ syntactically correct code
HAL Compliance: Proper STM32 HAL function usage
Complete Projects: Full main.c files with initialization
Best Practices: Following STM32 development guidelines
Scalable Architecture: Easy extension to new peripherals

11. Future Roadmap

11.1 Planned Enhancements

Extended Family Support: STM32H7, STM32WL, STM32U5 series
Advanced Peripherals: DMA, advanced timers, communication protocols
Real-time Features: RTOS integration and real-time applications
Optimization: Performance and power optimization suggestions
Testing Integration: Unit test generation for embedded code

11.2 Community Contributions

We welcome contributions from the embedded systems community:

Dataset Expansion: Submit high-quality STM32 examples
Bug Reports: Report incorrect code generation
Feature Requests: Suggest new peripheral support
Documentation: Improve usage examples and tutorials

12. Citation

If you use this model in your research, projects, or educational materials, please cite:

@model{stm32-hal-codegen-2025,
  title={STM32 HAL Code Generator: A Specialized Language Model for Embedded Systems},
  author={Murat Komurcu},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/MuratKomurcu/stm32-hal-codegen}
}

13. License

This model is released under the MIT License. You are free to use, modify, and distribute this model for both commercial and non-commercial purposes.

13.1 Base Model License

This work is based on Salesforce's CodeGen model, which is licensed under the BSD-3-Clause License. We acknowledge and respect the original license terms.

13.2 Attribution

Base Model: Salesforce/codegen-350M-mono
Fine-tuning: Custom STM32 HAL dataset and training pipeline
Specialized Domain: STM32 embedded systems development

14. Support and Contact

14.1 Community Support

GitHub Issues: Report bugs and feature requests
Documentation: Comprehensive guides and tutorials

14.2 Professional Contact

For commercial licensing, enterprise support, or collaboration opportunities:

LinkedIn: Murat Komurcu
Email: [email protected]

Downloads last month: 12

Safetensors

Model size

0.4B params

Tensor type

F32

Model tree for MuratKomurcu/stm32-hal-codegen

Base model

Salesforce/codegen-350M-mono

Finetuned

(24)

this model