trafilatura
Package: Scraping, Python 3.7+, Intermediate
Extract main text content from web pages; robust article extraction
Quick Info
- Documentation: Official Docs
- Python Version: 3.7+
- Dependencies: lxml, urllib3, certifi, courlan, htmldate, justext
- Install: pip install trafilatura
Quick Example
python
# Install: pip install trafilatura
import trafilatura

# Confirm that the package imports correctly
print("Using trafilatura")
# See the documentation for detailed examples
trafilatura is a third-party package, not part of the Python standard library, so install it with pip before importing it.
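As a slightly fuller illustration of main-text extraction, here is a minimal sketch built on trafilatura.extract and trafilatura.fetch_url. The helper names extract_main_text and extract_from_url, and the SAMPLE_HTML string, are made up for this example and are not part of the library. The import is placed inside the functions only so that the snippet loads even where the package is missing; a top-level import is the more usual style.

```python
from typing import Optional

# Hypothetical sample page: navigation and footer surround the article body.
SAMPLE_HTML = """
<html>
  <head><title>Sample article</title></head>
  <body>
    <nav><a href="/">Home</a> <a href="/about">About</a></nav>
    <article>
      <h1>Sample article</h1>
      <p>This paragraph stands in for the main content of a web page.
      Extraction tools try to keep text like this and drop the
      surrounding navigation, sidebars, and footers.</p>
      <p>A second paragraph gives the extractor a little more body text
      to work with, which tends to make the result more reliable.</p>
    </article>
    <footer>Copyright notice and site links</footer>
  </body>
</html>
"""


def extract_main_text(html: str) -> Optional[str]:
    """Return the main text of an HTML document, or None if nothing is found."""
    import trafilatura  # third-party: pip install trafilatura

    # include_comments=False skips user comment sections on the page.
    return trafilatura.extract(html, include_comments=False)


def extract_from_url(url: str) -> Optional[str]:
    """Download a page and return its main text, or None on failure."""
    import trafilatura  # third-party: pip install trafilatura

    downloaded = trafilatura.fetch_url(url)  # None if the download fails
    if downloaded is None:
        return None
    return trafilatura.extract(downloaded, include_comments=False)


if __name__ == "__main__":
    text = extract_main_text(SAMPLE_HTML)
    print(text if text is not None else "No main text found")
```

Both helpers can return None, so callers should check the result before using it. extract_from_url needs network access and is not called in the snippet itself.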
Tags
package, web-scraping, parsing, data-extraction