# Train Output

## Training Approaches

IdeaWeaver supports two flexible training approaches:

### 1. Config-Based Training

Using a YAML configuration file for complex setups:

```bash
ideaweaver train --config configs/config.yml --verbose
```
#### Configuration Used (YAML-based training)

```text
📋 Final configuration:
project_name: text-classifier
task: text_classification
backend: local
base_model: google/bert_uncased_L-2_H-128_A-2
dataset: ./datasets/training_data.csv
params:
  epochs: 3
  batch_size: 8
  learning_rate: 2e-05
  max_seq_length: 128
  eval_strategy: no
  save_total_limit: 1
data:
  train_split: train
hub:
  push_to_hub: false
method: sft
tracking:
  enabled: false
```
### 2. Command Line Training

Pass all parameters directly via the command line for quick experiments:

```bash
ideaweaver train \
  --model google/bert_uncased_L-2_H-128_A-2 \
  --dataset ./datasets/training_data.csv \
  --task text_classification \
  --project-name cli-final-test \
  --epochs 1 \
  --batch-size 4 \
  --learning-rate 2e-05 \
  --verbose
```
#### Command Line Training Output

Here's the successful output from the command line training approach:

```text
🤗 Using model: google/bert_uncased_L-2_H-128_A-2
🎯 Task: text_classification
📋 Final configuration:
backend: local
params: {'epochs': 1, 'batch_size': 4, 'learning_rate': 2e-05, 'max_seq_length': 128}
data: {'train_split': 'train'}
hub: {'push_to_hub': False}
base_model: google/bert_uncased_L-2_H-128_A-2
task: text_classification
dataset: ./datasets/training_data.csv
project_name: cli-final-test
tracking: {'enabled': False}
🚀 Starting model training...
📝 Training config written to: /tmp/training_config.yml
📋 Training configuration:
backend: local
base_model: google/bert_uncased_L-2_H-128_A-2
data:
  column_mapping:
    target: target
    text: text
  path: ./autotrain_projects/cli-final-test
  train_split: train
  valid_split: null
log: tensorboard
params:
  batch_size: 4
  epochs: 1
  lr: 2.0e-05
  max_seq_length: 128
project_name: cli-final-test
task: text_classification
🔧 Running command: autotrain --config /tmp/training_config.yml
...
[Training progress with successful completion]
...
📊 Loading metrics from trainer state: ./cli-final-test/checkpoint-1/trainer_state.json
✅ Successfully loaded metrics from trainer state
============================================================
🎉 TRAINING SUMMARY
============================================================
📂 Model Path: ./cli-final-test
🤖 Base Model: google/bert_uncased_L-2_H-128_A-2
📊 Dataset: ./autotrain_projects/cli-final-test
📊 KEY PERFORMANCE METRICS
----------------------------------------
📉 Final Train Loss: 1.0869
🎯 Overall Accuracy: 40.0%
============================================================
✨ Training completed successfully! Model is ready for use.
============================================================
✅ Training completed successfully!
📁 Model saved to: ./cli-final-test
```
## Key Metrics Achieved

| Metric | Value | Notes |
|---|---|---|
| Final Training Loss | 1.0869 | Extracted from trainer_state.json |
| Overall Accuracy | 40.0% | Evaluation accuracy on the validation set |
| Training Epochs | 1.0 | The single configured epoch ran to completion |
| Model Size | 17 MB | Compact BERT model (2 layers, 128 hidden units) |
| Dataset Size | 24 samples | 19 training + 5 validation samples |
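The 17 MB figure is consistent with this model's parameter count stored at fp32 precision. A quick way to check it yourself (a sketch using the transformers library; num_labels=3 matches the three classes used here):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "google/bert_uncased_L-2_H-128_A-2", num_labels=3
)

# ~4.4M parameters x 4 bytes (fp32) ≈ 17-18 MB on disk
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters ≈ {n_params * 4 / 1e6:.0f} MB in fp32")
```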
## Technical Details

Note: the paths and settings below correspond to the config-based run (project_name: text-classifier); the command-line run writes to ./cli-final-test instead.

### Model Architecture

- Base Model: google/bert_uncased_L-2_H-128_A-2
- Task: Text Classification (Sentiment Analysis)
- Classes: 3 (positive, negative, neutral; see the snippet below to confirm the mapping)
- Parameters: small BERT with 2 transformer layers, hidden size 128, and 2 attention heads (per the L-2_H-128_A-2 naming)
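The saved model's config records the class mapping, so you can inspect it directly (a minimal sketch; the exact label names depend on your dataset):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("./text-classifier")

# Maps class indices to the label names found in the training data,
# e.g. {0: 'negative', 1: 'neutral', 2: 'positive'}
print(config.id2label)
```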
### Training Configuration
- Learning Rate: 2e-05
- Batch Size: 8
- Max Sequence Length: 128
- Optimizer: AdamW
- Scheduler: Linear
- Early Stopping: Enabled (patience: 5, threshold: 0.01); a rough transformers equivalent of these settings is sketched below
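This sketch is not IdeaWeaver's internal code, just a rough equivalent of the settings above written against the Hugging Face transformers API; note that transformers' EarlyStoppingCallback additionally requires per-epoch evaluation, load_best_model_at_end, and a metric_for_best_model:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="./text-classifier",
    learning_rate=2e-5,              # Learning Rate: 2e-05
    per_device_train_batch_size=8,   # Batch Size: 8
    num_train_epochs=3,
    optim="adamw_torch",             # Optimizer: AdamW
    lr_scheduler_type="linear",      # Scheduler: Linear
    save_total_limit=1,
    eval_strategy="epoch",           # evaluation_strategy on older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,     # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
)
early_stop = EarlyStoppingCallback(
    early_stopping_patience=5,
    early_stopping_threshold=0.01,
)
```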
### Data Processing

- Training Split: 19 samples
- Validation Split: 5 samples
- Text Column: autotrain_text
- Target Column: autotrain_label
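The input CSV itself uses plain text and target columns (per the column_mapping in the training configuration above); AutoTrain renames them to autotrain_text and autotrain_label internally. A few hypothetical rows of ./datasets/training_data.csv might look like:

```csv
text,target
"The product works perfectly and arrived on time",positive
"Completely useless, broke after one day",negative
"It does what it says, nothing more",neutral
```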
## Files Generated

### Model Files

- ./text-classifier/model.safetensors (17 MB)
- ./text-classifier/config.json
- ./text-classifier/tokenizer.json
- ./text-classifier/tokenizer_config.json

### Training Artifacts

- ./text-classifier/checkpoint-3/trainer_state.json - contains detailed training metrics
- ./datasets/text-classifier/ - copy of the processed training data
- TensorBoard logs for visualization
## Verification Commands

```bash
# Verify model files
ls -la ./text-classifier/

# Check trainer state metrics
jq '.log_history[-1]' ./text-classifier/checkpoint-3/trainer_state.json

# Test model loading
python -c "from transformers import AutoTokenizer, AutoModelForSequenceClassification; \
tokenizer = AutoTokenizer.from_pretrained('./text-classifier'); \
model = AutoModelForSequenceClassification.from_pretrained('./text-classifier'); \
print('✅ Model loads successfully')"
```
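If jq is not installed, the same check works in a few lines of Python (log_history is a standard field in Hugging Face trainer_state.json files):

```python
import json

with open("./text-classifier/checkpoint-3/trainer_state.json") as f:
    state = json.load(f)

# The last log_history entry holds the final logged metrics
print(state["log_history"][-1])
```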
## Success Indicators

- ✅ Metrics Display Fixed: real training metrics now shown instead of "Not available"
- ✅ Model Training: successfully trained the sentiment classification model
- ✅ File Generation: all model files created correctly
- ✅ Data Processing: dataset processed and split appropriately
- ✅ Trainer State: detailed training history preserved in JSON format
## Next Steps

- Model Evaluation: test the trained model on new data (see the sketch below)
- Fine-tuning: experiment with hyperparameters for better accuracy
- Deployment: deploy the model for inference
- Integration: use with RAG or other IdeaWeaver features
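For the first item, a small evaluation sketch: the holdout file name and its text/target columns are assumptions mirroring the training CSV, and the pipeline returns the label names stored in the model config:

```python
import pandas as pd
from transformers import pipeline

# Hypothetical held-out file with the same text/target columns as the training data
df = pd.read_csv("./datasets/holdout.csv")

clf = pipeline("text-classification", model="./cli-final-test")
preds = [p["label"] for p in clf(df["text"].tolist())]

accuracy = sum(p == t for p, t in zip(preds, df["target"])) / len(df)
print(f"Held-out accuracy: {accuracy:.1%}")
```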
## Example: Training Output with MiniLM (Command-Line)

Below is a sample output from running the training command with the Microsoft MiniLM model:

```bash
ideaweaver train \
  --model microsoft/MiniLM-L6-H384-uncased \
  --dataset ./datasets/training_data.csv \
  --task text_classification \
  --project-name cli-minilm-test \
  --epochs 1 \
  --batch-size 4 \
  --learning-rate 2e-05 \
  --verbose
```

Note: If the model (microsoft/MiniLM-L6-H384-uncased) is not present locally, it will be automatically downloaded from Hugging Face the first time you run the command.
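If you prefer to avoid the download pause on the first run, you can pre-fetch the weights (a minimal sketch using huggingface_hub, which is installed alongside transformers):

```python
from huggingface_hub import snapshot_download

# Downloads the model files into the local Hugging Face cache
snapshot_download(repo_id="microsoft/MiniLM-L6-H384-uncased")
```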
### Sample Output

```text
🤗 Using model: microsoft/MiniLM-L6-H384-uncased
🎯 Task: text_classification
📋 Final configuration:
backend: local
params: {'epochs': 1, 'batch_size': 4, 'learning_rate': 2e-05, 'max_seq_length': 128}
data: {'train_split': 'train'}
hub: {'push_to_hub': False}
base_model: microsoft/MiniLM-L6-H384-uncased
task: text_classification
dataset: ./datasets/training_data.csv
project_name: cli-minilm-test
tracking: {'enabled': False}
🚀 Starting model training...
📝 Training config written to: /tmp/tmppyzh5clq.yml
📋 Training configuration:
backend: local
base_model: microsoft/MiniLM-L6-H384-uncased
data:
  column_mapping:
    target: target
    text: text
  path: ./autotrain_projects/cli-minilm-test
  train_split: train
  valid_split: null
log: tensorboard
params:
  batch_size: 4
  epochs: 1
  lr: 2.0e-05
  max_seq_length: 128
project_name: cli-minilm-test
task: text_classification
🔧 Running command: autotrain --config /tmp/tmppyzh5clq.yml
INFO | ... - Using AutoTrain configuration: /tmp/tmppyzh5clq.yml
INFO | ... - Running task: text_multi_class_classification
INFO | ... - Using backend: local
INFO | ... - Starting local training...
INFO | ... - loading dataset from disk
INFO | ... - Starting to train...
INFO | ... - {'loss': 1.1103, 'grad_norm': 2.46, 'learning_rate': 2e-05, 'epoch': 0.33}
INFO | ... - {'loss': 1.126, 'grad_norm': 3.26, 'learning_rate': 1.75e-05, 'epoch': 0.67}
INFO | ... - {'loss': 1.0338, 'grad_norm': 5.06, 'learning_rate': 1.5e-05, 'epoch': 1.0}
INFO | ... - {'eval_loss': 1.0817, 'eval_f1_macro': 0.19, 'eval_f1_micro': 0.4, 'eval_accuracy': 0.4, 'epoch': 1.0}
INFO | ... - {'loss': 1.0702, ...}
INFO | ... - {'loss': 1.1341, ...}
INFO | ... - {'loss': 1.1244, ...}
INFO | ... - {'eval_loss': 1.0829, 'eval_accuracy': 0.4, 'epoch': 2.0}
INFO | ... - {'loss': 1.1247, ...}
INFO | ... - {'loss': 1.0579, ...}
INFO | ... - {'loss': 1.0865, ...}
INFO | ... - {'eval_loss': 1.0832, 'eval_accuracy': 0.4, 'epoch': 3.0}
INFO | ... - {'train_runtime': 2.29, 'train_loss': 1.0964, 'epoch': 3.0}
INFO | ... - Finished training, saving model...
INFO | ... - Job ID: 36417
============================================================
🎉 TRAINING SUMMARY
============================================================
📂 Model Path: ./cli-minilm-test
🤖 Base Model: microsoft/MiniLM-L6-H384-uncased
📊 Dataset: ./autotrain_projects/cli-minilm-test
📊 KEY PERFORMANCE METRICS
----------------------------------------
📉 Final Train Loss: 1.0338
🎯 Overall Accuracy: 40.0%
============================================================
✨ Training completed successfully! Model is ready for use.
============================================================
✅ Training completed successfully!
📁 Model saved to: ./cli-minilm-test
```