PyImport Documentation
PyImport is a Python command-line tool for importing CSV data into MongoDB. It provides automatic type detection, parallel processing, and graceful handling of “dirty” data.
Features
Automatic type detection - Infers field types from CSV data
Multiple execution strategies - Sync, async, multi-process, and threaded imports
Parallel processing - Split large files and import in parallel for maximum throughput
Graceful error handling - Falls back to strings on type conversion errors
Flexible date parsing - Supports multiple date formats with fast ISO date parsing
Restart capability - Resume failed imports from where they left off
Performance optimized - Recent improvements make imports 20-35% faster than in earlier releases
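As a rough illustration of the type detection and string fallback listed above, the sketch below tries progressively looser conversions on a single CSV cell and keeps the original string when none succeed. It is not PyImport's actual implementation; the function name `convert_value` and the sample row are hypothetical and exist only for this example.

```python
from datetime import datetime


def convert_value(raw: str):
    """Convert one CSV cell to int, float, or datetime; fall back to the original string."""
    text = raw.strip()
    # Try the strictest type first, then progressively looser ones.
    for caster in (int, float, datetime.fromisoformat):
        try:
            return caster(text)
        except ValueError:
            continue
    # Nothing matched: keep the value as a string instead of failing the import.
    return raw


# Hypothetical row, as csv.DictReader would produce it (every value is a string).
row = {"id": "42", "price": "19.99", "created": "2024-01-15", "note": "n/a"}
converted = {name: convert_value(value) for name, value in row.items()}
print(converted)
```

Ordering the casters from strict to loose means an integer is never stored as a float, and a value that matches no type is kept as text rather than rejected.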
Quick Start
# Generate field file
pyimport --genfieldfile data.csv
# Import to MongoDB
pyimport --database mydb --collection mycol data.csv
# Fast parallel import
pyimport --multi --splitfile --autosplit 8 --poolsize 4 data.csv
Documentation
Contents:
- Introduction
- Installation
- Quick Start Guide
- PyImport Python API Documentation
- Command-Line Reference
  - Usage
  - Basic Options
  - MongoDB Connection Options
  - PostgreSQL Options (Experimental)
  - Field File Options
  - CSV Parsing Options
  - Data Enrichment Options
  - Import Performance Options
  - File Splitting Options
  - Audit Options
  - Restart Options (NEW in v1.10.0)
  - Collection Management Options
  - Error Handling Options
  - Logging and Output Options
  - Advanced Options
  - Common Usage Patterns
  - Performance Tips
  - See Also
- Field Files (.tff)
  - What is a Field File?
  - Generating Field Files
  - Field File Format
  - Supported Types
  - Date Format Strings
  - Defaults Section
  - Advanced Field File Examples
  - Type Conversion Behavior
  - Field File Discovery
  - Creating Field Files Manually
  - Troubleshooting
  - Performance Tips
  - Field File Examples
  - TFF v2.0: Nested Document Mapping (NEW!)
  - See Also
- Advanced Usage
Typical Performance
Sync: ~24,000-32,000 docs/sec
Async: ~30,000-40,000 docs/sec
Multi-process: ~50,000+ docs/sec
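To put these rates in context, the following sketch estimates how long an import might take at the low end of each quoted range. The document count is a hypothetical figure chosen for round numbers, and the helper `estimate_seconds` is illustrative only. Real throughput depends on hardware, document size, and the MongoDB deployment.

```python
# Low end of each range quoted above, in documents per second.
TYPICAL_DOCS_PER_SEC = {
    "sync": 24_000,
    "async": 30_000,
    "multi-process": 50_000,
}


def estimate_seconds(doc_count: int, docs_per_sec: int) -> float:
    """Return the expected import duration in seconds at a steady rate."""
    return doc_count / docs_per_sec


doc_count = 12_000_000  # hypothetical file size, not a measured workload
for strategy, rate in TYPICAL_DOCS_PER_SEC.items():
    print(f"{strategy}: ~{estimate_seconds(doc_count, rate):.0f} s")
```

At these assumed rates, a 12-million-row file would take roughly 500 seconds with the sync strategy and about 240 seconds with multi-process.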
Installation
pip install pyimport
Requirements
Python 3.11+
MongoDB 4.0+