Serialization formats
One of the most frequently required features when implementing scrapers is the ability to store the scraped data properly. Quite often, that means generating an “export file” containing the scraped data (commonly called an “export feed”) to be consumed by other systems.
Storage backends
Local filesystem
The feeds are stored on the local filesystem.
URI scheme: file
Example URI: file:///tmp/export.csv
Required external libraries: none
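To illustrate how a file URI such as the one above maps to a path on disk, here is a minimal sketch using only the Python standard library. The helper names local_feed_uri and local_feed_path are hypothetical and are not part of the feed export API; the sketch assumes absolute POSIX-style paths.

```python
from urllib.parse import quote, unquote, urlparse


def local_feed_uri(path: str) -> str:
    """Build a file:// URI from an absolute POSIX path (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {path!r}")
    # quote() percent-encodes characters such as spaces but keeps "/" intact.
    return "file://" + quote(path)


def local_feed_path(uri: str) -> str:
    """Recover the filesystem path from a file:// URI (hypothetical helper)."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URI: {uri!r}")
    return unquote(parsed.path)


if __name__ == "__main__":
    uri = local_feed_uri("/tmp/export.csv")
    print(uri)                   # file:///tmp/export.csv
    print(local_feed_path(uri))  # /tmp/export.csv
```

The three slashes in the example URI are the `file://` prefix followed by the leading slash of the absolute path, with an empty host part in between.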
FTP
The feeds are stored on an FTP server.
URI scheme: ftp
Example URI: ftp://user:pass@ftp.example.com/path/to/export.csv
Required external libraries: none
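The parts of an FTP URI can be inspected with the standard library, which may help when checking that credentials and paths are encoded as intended. This is only a sketch: FtpTarget and parse_ftp_feed_uri are hypothetical names, and the fallback to port 21 (the standard FTP control port) is an assumption about what a client would do when the URI gives no port, not a statement about how the feed exporter behaves internally.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlparse


class FtpTarget(NamedTuple):
    """Components of an FTP feed URI (hypothetical container)."""
    host: str
    port: int
    username: Optional[str]
    password: Optional[str]
    path: str


def parse_ftp_feed_uri(uri: str) -> FtpTarget:
    """Split an ftp:// URI into host, port, credentials and remote path."""
    parsed = urlparse(uri)
    if parsed.scheme != "ftp" or not parsed.hostname:
        raise ValueError(f"not a valid FTP URI: {uri!r}")
    return FtpTarget(
        host=parsed.hostname,
        port=parsed.port or 21,  # assumed default when no port is given
        username=unquote(parsed.username) if parsed.username else None,
        password=unquote(parsed.password) if parsed.password else None,
        path=unquote(parsed.path),
    )


if __name__ == "__main__":
    target = parse_ftp_feed_uri(
        "ftp://user:pass@ftp.example.com/path/to/export.csv"
    )
    print(target.host, target.port, target.path)
```

Credentials that contain reserved characters such as "@" or "/" would need to be percent-encoded in the URI for this kind of parsing to split them correctly.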
S3
The feeds are stored on Amazon S3.
URI scheme: s3
Example URIs:
s3://mybucket/path/to/export.csv
s3://aws_key:aws_secret@mybucket/path/to/export.csv
Required external libraries: botocore or boto
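The two S3 URI forms above differ only in whether credentials are embedded. The following sketch shows one way to read the bucket, object key and optional credentials out of such a URI using the standard library alone. S3Target and parse_s3_feed_uri are hypothetical names used for illustration; the sketch does not talk to S3 and does not use botocore or boto.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlparse


class S3Target(NamedTuple):
    """Components of an S3 feed URI (hypothetical container)."""
    bucket: str
    key: str
    access_key: Optional[str]
    secret_key: Optional[str]


def parse_s3_feed_uri(uri: str) -> S3Target:
    """Split an s3:// URI into bucket, object key and optional credentials."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.hostname:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    return S3Target(
        bucket=parsed.hostname,
        # The object key is the URI path without its leading slash.
        key=unquote(parsed.path.lstrip("/")),
        access_key=unquote(parsed.username) if parsed.username else None,
        secret_key=unquote(parsed.password) if parsed.password else None,
    )


if __name__ == "__main__":
    print(parse_s3_feed_uri("s3://mybucket/path/to/export.csv"))
    print(parse_s3_feed_uri("s3://aws_key:aws_secret@mybucket/path/to/export.csv"))
```

In the first form no credentials are present in the URI, so both credential fields come back as None and would have to be supplied some other way.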