Drone Firmware Tampering Detection — Project Guide

This document explains how to start the project, how all components connect, how models and training work, firmware generation, firmware comparison, model accuracy, and what each screenshot shows.

1. How to Start the Project

Prerequisites

Steps

  1. Start the backend (first terminal):
    cd backend
    pip install -r requirements.txt
    uvicorn main:app --reload --host 0.0.0.0 --port 8000
    The API runs at http://localhost:8000.
  2. Run the desktop app (second terminal, from project root):
    .\venv\bin\Activate.ps1
    python main.py
    If the virtual environment was created on Windows, the activation script is under Scripts instead: run .\venv\Scripts\Activate.ps1, then python main.py.

The window title will be Drone Firmware Tampering Detection. Ensure the status in the sidebar shows Connected so the app can talk to the backend.
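If the sidebar does not show Connected, it can help to confirm that something is listening on the backend port before debugging the app itself. The following is a minimal sketch using only the standard library; the function name backend_reachable is hypothetical and not part of the project, and it only checks that the TCP port accepts connections, not that the API responds correctly.

```python
import socket


def backend_reachable(host: str = "localhost", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling backend_reachable() with the defaults checks the address used in the steps above (localhost, port 8000).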

2. How Everything Is Connected

High-level flow:

So: Desktop (PySide6) → APIClient → Backend (FastAPI). Training is separate: it is done in the Jupyter notebook and writes model files that the desktop can use for local triage.

3. How All Files Are Connected

| File / Folder | Role |
| --- | --- |
| main.py | Entry point; creates QApplication and MainWindow. |
| config.py | API URL and timeouts; imported by api/client.py. |
| api/client.py | APIClient used by all views and by main_window (for health check). |
| views/main_window.py | Imports all views, builds nav and stack, wires _show("dashboard"\|"upload"\|...) and analysis_selected → detail. |
| views/dashboard.py | Uses API to get stats and recent analyses; emits analysis_selected(firmware_id) to open detail. |
| views/upload.py | Uses APIClient to upload and analyze; model choice (ensemble, etc.) sent to analyze_firmware. |
| views/analyses.py | Lists analyses via API; clicking a row emits analysis_selected so main_window shows detail. |
| views/analysis_detail.py | Loads one analysis by ID via API; can call get_analysis_features, generate report, run local triage via utils/local_model. |
| views/research_view.py | Compares two firmware files: uploads both, runs analysis on each, shows baseline vs modified result (firmware comparison). |
| views/generator.py | Contains FirmwareGenerator and GeneratorView UI; generates firmware binaries and saves to a folder. |
| utils/local_model.py | Loads models/drone_firmware_lr.joblib, scaler, metadata; provides local_predict and get_local_feature_importance for analysis detail. |
| backend/main.py | Implements all API routes; stores uploads under backend/uploads/ and analysis results in memory. |
| drone_firmware_training.ipynb | Loads DJI dataset, cleans/encodes, trains Logistic Regression (and optional XGBoost), saves model/scaler/metadata to models/. |
| models/ | Holds drone_firmware_lr.joblib, drone_firmware_scaler.joblib, drone_firmware_metadata.joblib produced by the notebook. |

4. How the Models Work

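Models appear in two places in the system:

  1. Backend analysis models: the model chosen in the Upload view (ensemble, random_forest, lstm, autoencoder) is sent to the backend with the analysis request.
  2. Local triage model: the Logistic Regression model trained in the notebook and saved under models/, loaded by utils/local_model.py for the quick local check in the analysis detail view.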

Why These Models Are Used

| Model | Usefulness |
| --- | --- |
| Ensemble | Combines several models (e.g. RF + LR) to reduce variance and improve robustness; good for tamper vs clean with limited data. |
| Random Forest (rf) | Handles mixed and categorical features, gives feature importance, less overfitting than a single tree. |
| Logistic Regression (lr) | Simple, interpretable coefficients (explainability), stable with regularization; used as the saved local model. |
| LSTM | Designed for sequences; useful if firmware is treated as a sequence of bytes or chunks (optional/advanced). |
| Autoencoder | Unsupervised; learns “normal” firmware and flags anomalies (tampered) by reconstruction error; useful when labels are scarce. |

The UI shows per-model results (e.g. rf: clean 21.4%, lr: clean 19.5%) and feature importance (e.g. risk_score, verification_passed, section_count, entropy_score) so you can see how each model contributes and which features drive the decision.
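To make the idea of combining per-model results concrete, here is a minimal sketch of one common ensembling rule: averaging each model's tamper probability and applying a threshold. This is an assumption for illustration, not the project's confirmed method; the function name ensemble_vote, the 0.5 threshold, and the reading of the percentages as tamper probabilities are all hypothetical.

```python
from statistics import fmean


def ensemble_vote(tamper_probs: dict[str, float], threshold: float = 0.5) -> tuple[str, float]:
    """Average per-model tamper probabilities and apply a decision threshold."""
    if not tamper_probs:
        raise ValueError("at least one model probability is required")
    avg = fmean(tamper_probs.values())
    label = "tampered" if avg >= threshold else "clean"
    return label, avg


# Per-model values from the UI example above (rf 21.4%, lr 19.5%),
# interpreted here as tamper probabilities (an assumption).
label, prob = ensemble_vote({"rf": 0.214, "lr": 0.195})
print(label, round(prob, 4))
```

With both models well below the threshold, the averaged result is also classified as clean. Other combination rules (weighted averages, majority vote) would fit the same interface.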

5. How Training Is Done

Training is done entirely in drone_firmware_training.ipynb, not in the desktop or backend.

  1. Data — Notebook loads dataset/dji_firmware_dataset_v2/dji_firmware_dataset/ (train.csv, val.csv, test.csv).
  2. Cleaning — Drops leaky/ID columns, converts booleans to int, fills missing values, encodes categoricals with LabelEncoder.
  3. Features — Uses a fixed list (e.g. size_kb, entropy_score, section_count, verification_passed, risk_score, plus categoricals like drone_model, file_extension). Target is label (0 = clean, 1 = tampered).
  4. Model — Trains LogisticRegression(C=0.1, class_weight='balanced') on scaled features (StandardScaler), evaluates on train/val/test (accuracy, precision, recall, F1, AUC).
  5. Export — Saves model, scaler, and metadata (feature columns, label encoders) to models/drone_firmware_*.joblib. The desktop’s local triage uses these files.
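Steps 4 and 5 imply that scoring a new file needs only the saved scaler statistics and the model coefficients. The sketch below reproduces that arithmetic with the standard library so the data flow is visible; it is not the notebook's code. The feature values are the ones shown in the analysis detail screenshot, while the means, standard deviations, coefficients, and intercept are hypothetical placeholders.

```python
import math


def standardize(x: list[float], mean: list[float], std: list[float]) -> list[float]:
    """Apply z = (x - mean) / std per feature, as a fitted standard scaler does."""
    return [(xi - m) / s for xi, m, s in zip(x, mean, std)]


def predict_tamper_probability(z: list[float], coef: list[float], intercept: float) -> float:
    """Logistic regression score: sigmoid(w . z + b)."""
    logit = intercept + sum(w * zi for w, zi in zip(coef, z))
    return 1.0 / (1.0 + math.exp(-logit))


features = ["entropy_score", "section_count", "verification_passed", "risk_score"]
x = [4.59, 19.0, 1.0, 0.77]      # values from the analysis detail example
mean = [5.0, 15.0, 0.5, 0.5]     # hypothetical scaler means
std = [1.0, 5.0, 0.5, 0.25]      # hypothetical scaler standard deviations
coef = [0.13, 0.13, -0.19, 0.34] # hypothetical model coefficients
intercept = 0.0                  # hypothetical intercept

z = standardize(x, mean, std)
p = predict_tamper_probability(z, coef, intercept)
print(f"P(tampered) = {p:.3f}")
```

Because the features are standardized, the sign and magnitude of each coefficient indicate how strongly that feature pushes the score toward tampered or clean, which is what makes this model convenient for explainability.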

6. Model Accuracy

Accuracy is computed in the notebook on the train/val/test sets after training. XGBoost evaluation results:

XGBoost Train — Acc: 0.7519  F1: 0.0000  AUC: 0.9680
XGBoost Val   — Acc: 0.7529  F1: 0.0000  AUC: 0.9784
XGBoost Test  — Acc: 0.7416  F1: 0.0000  AUC: 0.9921

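XGBoost Test confusion matrix:
[[66  0]
 [23  0]]

Assuming the usual scikit-learn layout (rows are true labels, columns are predictions), the model predicted every test sample as clean: all 23 tampered files were missed, which is why F1 is 0 and accuracy (66/89 ≈ 0.7416) only reflects the share of clean samples. The high AUC suggests the scores still rank tampered files above clean ones, so the decision threshold or class weighting is the likely issue rather than the features.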
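The reported test metrics can be re-derived from the confusion matrix. This is a minimal sketch, assuming the matrix is laid out as [[TN, FP], [FN, TP]] with tampered as the positive class; the helper name binary_metrics is hypothetical, and undefined ratios (division by zero) are reported as 0.

```python
def binary_metrics(cm: list[list[int]]) -> dict[str, float]:
    """Compute metrics from a 2x2 confusion matrix laid out as [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": (tn + tp) / total, "precision": precision, "recall": recall, "f1": f1}


m = binary_metrics([[66, 0], [23, 0]])
print({k: round(v, 4) for k, v in m.items()})
# {'accuracy': 0.7416, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0}
```

The accuracy and F1 values match the XGBoost Test row, with recall of 0 for the tampered class.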

The backend currently returns mock tamper/clean and probabilities, so accuracy in the live app is not from this evaluation pipeline until the backend is wired to the same model or to a real ML service.

7. How Firmware Generation Works (Code to Look At)

Firmware is generated in the Generate tab. The logic lives in views/generator.py, class FirmwareGenerator.

So: views/generator.py (FirmwareGenerator + GeneratorView) is the only place that creates firmware; no backend call is involved.
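For orientation before reading views/generator.py, the sketch below illustrates the general techniques the Generate tab names: an XOR cipher with a repeating key and a file name built from model, version, and timestamp. It is not the FirmwareGenerator code. The key, the payload, the SHA-256 digest used as a stand-in for the simulated signature, and the naming pattern (modeled on the file names shown in the dashboard) are all assumptions.

```python
import hashlib
import os
import time
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def firmware_filename(model: str, version: str, timestamp: int) -> str:
    """Build a name from model, version, and Unix timestamp (pattern is assumed)."""
    return f"fw_{model.replace(' ', '_')}_{version}_{timestamp}.bin"


payload = os.urandom(256)                       # stand-in for generated firmware content
key = b"demo-key"                               # hypothetical key
encrypted = xor_cipher(payload, key)
digest = hashlib.sha256(encrypted).hexdigest()  # stand-in for a simulated signature
name = firmware_filename("Mavic 3", "1.0.0", int(time.time()))
print(name, digest[:16])
```

Note that a repeating-key XOR is symmetric and offers no real confidentiality, and a plain hash is not a signature; both are adequate only for producing test inputs.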

8. How Firmware Comparison Works

Firmware comparison is in the Research tab (views/research_view.py).

  1. User selects baseline and modified firmware files (e.g. original and padded/altered).
  2. On “Upload and compare”, CompareWorker (QThread) runs: uploads baseline → gets firmware_id → calls analyze_firmware(fid1, "ensemble"); same for modified file.
  3. It then fetches full analysis for both IDs via get_analysis and emits (d1, d2) to the UI.
  4. The UI shows for each: Tampered/Clean and probability (%), and a short summary: “Detector outcome differs” or “Same classification”.

So comparison = two independent analyses (same backend/ensemble), then side-by-side result. There is no byte-by-byte diff; the “comparison” is of the detector’s decision on each file.
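The summary step can be sketched as a comparison of two classification results. The AnalysisResult class and its field names are hypothetical; the actual shape of the data returned by get_analysis is not shown in this guide. The example values are the ones from the comparison screenshot.

```python
from dataclasses import dataclass


@dataclass
class AnalysisResult:
    """Hypothetical container for the two fields the comparison view displays."""
    is_tampered: bool
    tamper_probability: float  # in the range 0.0 to 1.0

    def describe(self) -> str:
        label = "Tampered" if self.is_tampered else "Clean"
        return f"{label} {self.tamper_probability * 100:.1f}%"


def compare(baseline: AnalysisResult, modified: AnalysisResult) -> str:
    """Summarize whether the detector classified the two files differently."""
    if baseline.is_tampered != modified.is_tampered:
        verdict = "Detector outcome differs"
    else:
        verdict = "Same classification"
    return f"Baseline: {baseline.describe()}, Modified: {modified.describe()}. Result: {verdict}"


print(compare(AnalysisResult(False, 0.18), AnalysisResult(True, 0.885)))
```

Only the predicted labels drive the verdict, so two files with very different probabilities on the same side of the decision threshold would still be reported as the same classification.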

9. Screenshots and What Each Explains

Each screenshot is described below exactly as it appears in the application. Image files are in the doc/ folder.

Screenshot 2026-03-08 20135755.png

What it shows: The Firmware Generator screen (Generate tab is highlighted in the sidebar). The sidebar shows “Drone Firmwa”, navigation buttons (Dashboard, Upload, Analyses, Research, Generate), and “Connected” at the bottom.

What it explains: The configuration panel: Drone Model is “Mavic 3”, Firmware Version is “1.0.0”. Both Encrypt Firmware (Advanced XOR Cipher) and Sign Firmware (Simulated RSA) are checked. The output path is C:\Reaserch\RP connect 42\IT22291646\generated_firmware with a “Change Folder” button. The green Generate Firmware button is below. The log area at the bottom shows the placeholder text “Generation logs will appear here…” (no generation has been run yet). This is the screen used to create unique, optionally encrypted and signed firmware for testing.

Firmware Generator - Configuration

Screenshot 2026-03-08 20140341.png

What it shows: The Upload Firmware screen. Title “Upload Firmware” at the top, with the line “Upload firmware files (.csv, .bin, .hex, .elf) for tampering detection.”

What it explains: The main upload area in the center says “Drag & drop firmware file here or click to browse”. At the bottom left, Model: is shown with the dropdown open, listing the analysis models: ensemble (selected and highlighted), random_forest, lstm, and autoencoder. To the right, the words “Upload” and “Analyze” appear faded, indicating they are disabled until a file is selected. This screen is where you choose which ML model the backend will use for tampering detection and then upload a firmware file.

Upload Firmware - Model selection

Screenshot 2026-03-08 20135634.png

What it shows: The Dashboard (Dashboard is the active item in the sidebar). Window title “Drone Firmware Tampering Detection”; sidebar shows “Drone Firmwa”, the five nav buttons, and “Connected” at the bottom.

What it explains: The three summary cards show: Total Analyses: 2 (blue), Tampered Detected: 1 (red), and Clean Firmware: 1 (green). Under “Recent Activity” there are two entries: (1) fw_Mavic_3_1.0.2_1772958292.bin with ID 8c9601a5... and a green CLEAN badge; (2) fw_Mavic_3_1.0.0_1772957442.bin with ID da7d4c57... and a red TAMPERED badge. A “Refresh Data” button is at the top right. This is the overview of analysis results and how the model has classified each firmware file.

Dashboard - Stats and Recent Activity

Screenshot 2026-03-08 135728.png

What it shows: The Robustness / Research screen (Research is highlighted in the sidebar). No firmware files have been selected yet.

What it explains: The text explains: “Compare baseline and modified drone firmware to evaluate detector robustness. Upload two files (e.g. original and padded or altered) and run comparison.” The Compare two firmware files section has two rows: both show “No file selected”, with buttons Select baseline firmware and Select modified firmware. Below that is the Upload and compare button. The Comparison result section is empty and shows the placeholder “Run a comparison to see results.” This is the initial state of the firmware comparison (robustness evaluation) flow.

Research - No files selected

Screenshot 2026-03-08 20135744.png

What it shows: The same Robustness / Research screen after two files were selected and the comparison was run.

What it explains: The baseline field shows fw_Mavic_3_1.0.0_1772957522.bin and the modified field shows fw_Mavic_3_1.0.2_1772958292.bin. The Comparison result area now displays: Baseline: Clean 18.0%, Modified: Tampered 88.5%, and the line “Result: Detector outcome differs between baseline and modified.” So the detector classified the baseline as clean (low tampering probability) and the modified file as tampered (high probability), and the UI summarizes that the outcomes differ. This illustrates how firmware comparison works: two analyses, then a side-by-side result and a short summary.

Research - Comparison result

Screenshot 2026-03-08 20135721.png

What it shows: The Analysis Detail view for the file fw_Mavic_3_1.0.2_1772958292.t. Header has “← Back”, the filename, and buttons “Retry Analysis” and “Generate PDF Report”. It states MODEL: ensemble.

What it explains: (1) Uncertainty — A progress bar at about 29% with the label “Confidence: Low”, showing the ensemble’s confidence for this analysis. (2) Per-model results — Table: model rf prediction “clean” at 21.4%, model lr prediction “clean” at 19.5%. (3) Feature importance (explainability) — risk_score 0.3396, verification_passed -0.1949, section_count 0.1332, entropy_score 0.1284. (4) Recommendations — e.g. “Verify firmware signature with vendor” (evidence: signature_verification_failed, risk_score) and “Compare section layout to known-good baseline” (evidence: section_count, entropy_score). (5) Features analyzed (drone firmware) — entropy_score 4.59, section_count 19, verification_passed 1, risk_score 0.77. (6) Local triage — “Run a quick local check using the trained model” and a “Run local check” button. (7) Section entropy / layout — A horizontal bar chart with axis 0.0 to 8.0 and several bars for section entropy. This screen shows how the model output is presented and how explainability and recommendations are tied to evidence.

Analysis Detail - Full report

Screenshot 2026-03-08 20135812.png

What it shows: The Firmware Generator screen (Generate tab highlighted) after a firmware file has been successfully generated.

What it explains: Configuration is the same (e.g. Drone Model “Mavic 3”, Firmware Version “1.0.4”, both Encrypt and Sign checkboxes checked, output path and “Change Folder”). The important part is the log panel at the bottom: it shows a timestamp [13:58:01], the line Generated: fw_Mavic3_v1.0.4_1772958481.bin, the full path C:\Reaserch\RP connect 42\IT22291646\generated_firmware\fw_Mavic3_v1.0.4_1772958481.bin, Encrypted: Yes, and Signed: Yes. This confirms that generation ran successfully and that the file was encrypted and signed as requested. Use this screenshot to explain the generation output and naming (model, version, timestamp) and where to find the generated file.

Firmware Generator - Log after generation

10. Quick Reference