From d5f5e93f4fc8bf996d820accd8bb4136c97af654 Mon Sep 17 00:00:00 2001
From: Nicolas Mowen
Date: Mon, 1 Dec 2025 10:51:03 -0700
Subject: [PATCH] Update classification docs for training recommendations

---
 .cspell/frigate-dictionary.txt                     | 1 +
 .../custom_classification/state_classification.md  | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/.cspell/frigate-dictionary.txt b/.cspell/frigate-dictionary.txt
index 6e66a4704..329c41815 100644
--- a/.cspell/frigate-dictionary.txt
+++ b/.cspell/frigate-dictionary.txt
@@ -191,6 +191,7 @@ ONVIF
 openai
 opencv
 openvino
+overfitting
 OWASP
 paddleocr
 paho
diff --git a/docs/docs/configuration/custom_classification/state_classification.md b/docs/docs/configuration/custom_classification/state_classification.md
index 66d3e60ca..927fe91af 100644
--- a/docs/docs/configuration/custom_classification/state_classification.md
+++ b/docs/docs/configuration/custom_classification/state_classification.md
@@ -69,4 +69,6 @@ Once all images are assigned, training will begin automatically.
 ### Improving the Model
 
 - **Problem framing**: Keep classes visually distinct and state-focused (e.g., `open`, `closed`, `unknown`). Avoid combining object identity with state in a single model unless necessary.
-- **Data collection**: Use the model’s Recent Classifications tab to gather balanced examples across times of day and weather.
+- **Data collection**: Use the model's Recent Classifications tab to gather balanced examples across times of day and weather.
+- **When to train**: Focus on cases where the model is entirely incorrect or flips between states when it should not. There's no need to train additional images when the model is already working consistently.
+- **Selecting training images**: Images scoring below 100% due to new conditions (e.g., first snow of the year, seasonal changes) or variations (e.g., objects temporarily in view, insects at night) are good candidates for training, as they represent scenarios that differ from the default state. Training on these lower-scoring images, which differ from the existing training data, helps prevent overfitting. Avoid training large quantities of images that look very similar, especially if they already score 100%, as this can lead to overfitting.