should we remove duplicates? (devil's in the details)

Jodi Schneider Jan 12, 2010 12:24:53 PM
I've noticed some duplicate items. I deleted one item from a pair with matching titles (the copy with a typo: a missing space in the title). In general, though, it's difficult to decide which item to keep, because even when the titles and the gist of the metadata match, the details differ. For instance, look at "A content-driven reputation system for the Wikipedia": one copy has the correct proceedings title, the other has an abstract. So getting the best result means merging items, not just deleting one, and that takes some work. Maybe choosing based on the source repository makes sense? e.g. trust ACM over unAPI? -Jodi
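To make the "trust one repository over another, but keep the best of both" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the `SOURCE_RANK` table, the `merge_items` helper, the field names, and the placeholder values are hypothetical, and none of it uses Zotero's own data model or API.

```python
# Hypothetical trust ranking: a lower number means a more trusted source.
SOURCE_RANK = {"ACM": 0, "unAPI": 1}


def merge_items(a, b):
    """Merge two duplicate records, preferring the more trusted source per field.

    Records are plain dicts (an assumption, not Zotero's item format).
    Empty fields in the trusted record are filled from the other one.
    """
    unknown = len(SOURCE_RANK)  # unlisted sources rank last
    primary, secondary = sorted(
        (a, b), key=lambda item: SOURCE_RANK.get(item.get("source"), unknown)
    )
    merged = dict(secondary)
    for field, value in primary.items():
        if value:  # only non-empty values from the trusted record win
            merged[field] = value
    return merged


# Placeholder values stand in for the real metadata.
acm = {
    "source": "ACM",
    "title": "A content-driven reputation system for the Wikipedia",
    "proceedings": "(proceedings title from ACM)",
    "abstract": "",
}
unapi = {
    "source": "unAPI",
    "title": "A content-driven reputation system for the Wikipedia",
    "proceedings": "",
    "abstract": "(abstract from unAPI)",
}

merged = merge_items(unapi, acm)
```

With these placeholder records, the merged item keeps the proceedings title from the ACM copy and picks up the abstract from the unAPI copy. The sketch only decides which field values to keep; it doesn't say which of the two items should survive in the library.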
Chitu Okoli Mar 9, 2010 9:27:53 PM
The problem with pruning duplicates is that if anyone has already cited one of them, deleting it breaks the citations in their files on update. I would wait until Zotero implements duplicate detection and merging as a public feature (I know it's already functional internally), hopefully in version 2.1. Then there would be a more consistent way of dealing with this problem. To me, duplicate handling is the single most needed feature in Zotero right now.