Opened 10 years ago
Closed 10 years ago
#306 closed defect (fixed)
some New York Times ads prevent page from being recognized
| Reported by: | simon | Owned by: | simon |
|---|---|---|---|
| Priority: | major | Milestone: | 1.0 Beta 2 |
| Component: | ingester | Version: | 1.0 |
| Keywords: | | Cc: | |
Description
I assume this has something to do with iframes?
Change History (1)
comment:1 Changed 10 years ago by simon
- Resolution set to fixed
- Status changed from new to closed
(In [734]) closes #334, Washington Post scraper shouldn't include " - washingtonpost.com" in title
closes #313, Blacklist known ad sites from scraper detection
closes #306, some New York Times ads prevent page from being recognized
closes #308, attachment import bug
Currently, the ad site blacklist is located at the top of ingester/browser.js. At some point, we may want to switch this to a database table.
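For readers unfamiliar with the approach, here is a minimal sketch of how a blacklist like this could keep ad frames out of scraper detection. It is not the code from [734]: the names `AD_SITE_BLACKLIST`, `isBlacklistedAdSite`, and `framesToConsider` are hypothetical, and the two hostname patterns are illustrative placeholders rather than the actual list in ingester/browser.js.

```javascript
// Hypothetical ad-site blacklist. The patterns are illustrative examples,
// not the actual entries from ingester/browser.js.
const AD_SITE_BLACKLIST = [
  /(^|\.)doubleclick\.net$/i,
  /(^|\.)googlesyndication\.com$/i,
];

// Returns true if the URL's hostname matches a blacklisted ad site.
function isBlacklistedAdSite(url) {
  let hostname;
  try {
    hostname = new URL(url).hostname;
  } catch (e) {
    return false; // unparseable URLs are not treated as ads
  }
  return AD_SITE_BLACKLIST.some((pattern) => pattern.test(hostname));
}

// Drops frame URLs that belong to known ad sites, so that only the
// remaining documents are considered for scraper detection.
function framesToConsider(frameUrls) {
  return frameUrls.filter((url) => !isBlacklistedAdSite(url));
}
```

Matching on the hostname rather than the full URL keeps the list short and avoids false positives when an ad domain merely appears in a query string. If the list later moves to a database table, only `AD_SITE_BLACKLIST` would need to be populated differently.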