Opened 10 years ago

Closed 10 years ago

#306 closed defect (fixed)

some New York Times ads prevent page from being recognized

Reported by: simon
Owned by: simon
Priority: major
Milestone: 1.0 Beta 2
Component: ingester
Version: 1.0
Keywords:
Cc:

Description

I assume this has something to do with iframes?

Change History (1)

comment:1 Changed 10 years ago by simon

  • Resolution set to fixed
  • Status changed from new to closed

(In [734]) closes #334, Washington Post scraper shouldn't include " - washingtonpost.com" in title
closes #313, Blacklist known ad sites from scraper detection
closes #306, some New York Times ads prevent page from being recognized
closes #308, attachment import bug

Currently, the ad site blacklist is located at the top of ingester/browser.js. At some point, we may want to switch this to a database table.
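
For illustration only, a blacklist check of this kind might look roughly like the sketch below. This is not the code from [734]: the names AD_SITE_BLACKLIST and isBlacklistedAdSite, and the example.com / example.net hosts, are hypothetical placeholders, and the real list at the top of ingester/browser.js is not shown in this ticket. The sketch also assumes a runtime that provides the standard URL class.

```javascript
// Hypothetical blacklist entries (placeholders, not the real list).
// Each pattern matches the host itself or any subdomain of it.
const AD_SITE_BLACKLIST = [
  /(^|\.)ads\.example\.com$/i,
  /(^|\.)adserver\.example\.net$/i,
];

// Returns true when the URL's host matches a blacklisted ad site,
// so the caller can skip scraper detection for that frame.
function isBlacklistedAdSite(url) {
  let host;
  try {
    host = new URL(url).hostname;
  } catch (e) {
    // Unparseable URLs are treated as not blacklisted.
    return false;
  }
  return AD_SITE_BLACKLIST.some((pattern) => pattern.test(host));
}
```

Under these assumptions, the ingester would call isBlacklistedAdSite() on the URL of each loaded frame and ignore frames that match, so an ad iframe would no longer prevent the surrounding page from being recognized. Moving the list to a database table would only change where the patterns are loaded from.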
