Opened 10 years ago

Closed 10 years ago

#351 closed enhancement (fixed)

Scrapers with PDF downloads should use downloadAssociatedFiles instead of automaticSnapshots

Reported by: dstillman
Owned by: simon
Priority: major
Milestone: 1.0 Beta 3
Component: ingester
Version: 1.0
Keywords:
Cc:

Description

HTML downloads can continue to use automaticSnapshots. PDF downloads (and downloads of any other large file) should use the separate downloadAssociatedFiles pref, which will default to off.

(This will address the reports of scraping being slow with automaticSnapshots on, which may simply reflect the reporters' network speed when downloading large files.)
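
A minimal sketch of the proposed split, for illustration only. The two pref names come from this ticket; the `Prefs` interface, `defaultPrefs`, `shouldDownload`, and the MIME-type check are hypothetical and not taken from the actual ingester code. The default for automaticSnapshots is an assumption; the ticket only states that downloadAssociatedFiles defaults to off.

```typescript
// Hypothetical container for the two prefs named in this ticket.
interface Prefs {
  automaticSnapshots: boolean;
  downloadAssociatedFiles: boolean;
}

// Assumed defaults: only downloadAssociatedFiles = false is stated in the ticket.
const defaultPrefs: Prefs = {
  automaticSnapshots: true,
  downloadAssociatedFiles: false,
};

// Decide whether a scraper should fetch an attachment of the given MIME type.
// HTML pages are governed by automaticSnapshots; PDFs and any other
// (potentially large) files are governed by downloadAssociatedFiles.
function shouldDownload(mimeType: string, prefs: Prefs): boolean {
  const isHtml =
    mimeType === "text/html" || mimeType === "application/xhtml+xml";
  return isHtml ? prefs.automaticSnapshots : prefs.downloadAssociatedFiles;
}
```

With these assumed defaults, a scraper would still snapshot HTML pages but would skip PDF downloads until the user turns downloadAssociatedFiles on, so large files no longer slow down scraping by default.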

Related to #327

Change History (1)

comment:1 Changed 10 years ago by simon

  • Resolution set to fixed
  • Status changed from new to closed

(In [939])

  • closes #327, scrapers should either take snapshots or use URL field
  • closes #351, scrapers with PDF downloads should use downloadAssociatedFiles instead of automaticSnapshots

There are some problems with snapshot titles. See bug #436.
