Opened 8 years ago
Closed 8 years ago
#1076 closed defect (wontfix)
Newline-delimited BibTeX keywords containing commas aren't imported properly
| Reported by: | dstillman | Owned by: | simon |
|---|---|---|---|
| Priority: | minor | Milestone: | 1.0.8 |
| Component: | translators | Version: | 1.0 |
| Keywords: | | Cc: | |
Description
The BibTeX translator supports newline-delimited keywords, but if any of the keyword lines contain commas, the lines before and after the commas get mushed together into a single keyword.
Example attached.
Attachments (1)
Change History (6)
Changed 8 years ago by dstillman
comment:1 Changed 8 years ago by simon
comment:2 Changed 8 years ago by dstillman
EndNote®: http://forums.zotero.org/discussion/3504/problem-importing-keywords-from-endnote/
So in your example, keyword3 and keyword4 are supposed to be part of the same keyword, with a space in between? That is indeed problematic, and any attempt to figure out the format used is probably more trouble than it's worth. I'll check with the user whether EndNote® puts a header at the top of the file that we could use to switch on broken-BibTeX mode.
If there's not a good solution here, we can forget about it and just deal with #1075 (and then encourage switchers to use RIS). That one's out of spec too, but at least it's their own format.
comment:3 Changed 8 years ago by simon
It looks like EndNote® BibTeX files have no header, but, at least in X1, they do have the following characteristics that we might be able to use to identify them unambiguously:
- \r\n line endings (probably uncommon for TeX files)
- a multiple of 3 spaces before every field
- entries end in } } followed by (\r\n)*4
Is it worth trying to adjust for EndNote®'s broken BibTeX, or shall we just tell people to use RIS?
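For reference, the three characteristics above could be tested with something like the following. This is only a rough Python sketch, not translator code: the function name `looks_like_endnote_bibtex` is made up, "a multiple of 3 spaces" is assumed to mean a nonzero multiple, and "} } followed by (\r\n)*4" is assumed to mean two closing braces (optionally separated by whitespace) followed by four CRLF pairs.

```python
import re


def looks_like_endnote_bibtex(text: str) -> bool:
    """Heuristic guess only; hypothetical helper, not part of the translator."""
    # 1. Every line ending must be \r\n (no bare \n anywhere).
    if "\r\n" not in text or re.search(r"(?<!\r)\n", text):
        return False

    # 2. Every "name = ..." field line is indented by a nonzero
    #    multiple of 3 spaces (assumed reading of the observation).
    field_indents = re.findall(r"^( *)[A-Za-z][\w-]*\s*=", text, flags=re.MULTILINE)
    if not field_indents:
        return False
    if not all(len(i) > 0 and len(i) % 3 == 0 for i in field_indents):
        return False

    # 3. At least one entry ends in "} }" followed by four CRLF pairs
    #    (assumed reading of "(\r\n)*4").
    return re.search(r"\}\s*\}(?:\r\n){4}", text) is not None


sample = (
    "@article{key1,\r\n"
    "   author = {Doe, Jane},\r\n"
    "   title = {A Title},\r\n"
    "   year = {2008} }\r\n\r\n\r\n\r\n"
)
print(looks_like_endnote_bibtex(sample))
```

A check like this would still misfire on hand-written files that happen to share these traits, which is part of why the detection may not be worth the effort.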
comment:4 Changed 8 years ago by dstillman
With #874 now in better shape, I have no problem WONTFIXing this and recommending only RIS in the documentation.
comment:5 Changed 8 years ago by dstillman
- Resolution set to wontfix
- Status changed from new to closed
Created #1080 for the documentation change once 1.0.8 is out.
Dan, where are these BibTeX files coming from? My understanding is that BibTeX parsers are supposed to treat \s+ as a space (see, for example, the examples on Wikipedia), and, from a quick Google search, most implementations appear to do this. Thus, one can imagine a valid entry with:
keywords = {keyword1,keyword2,keyword3 keyword4,keyword5,keyword6,keyword7}
which an attempt to fix this issue would likely break. Unless files like the example here are common, I'm hesitant to break support for BibTeX files that expect the parser to collapse whitespace.
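To make the conflict concrete, here is a small Python sketch of the two readings. It assumes the whitespace between keyword3 and keyword4 in the entry above is a line break in the file; both function names are hypothetical and neither is the translator's actual code.

```python
import re

# Assumed raw field value: a line break sits between keyword3 and keyword4.
raw = "keyword1,keyword2,keyword3\nkeyword4,keyword5,keyword6,keyword7"


def split_collapsing_whitespace(value: str) -> list[str]:
    # Whitespace-collapsing reading: any run of whitespace becomes one
    # space, and only commas separate keywords.
    collapsed = re.sub(r"\s+", " ", value.strip())
    return [k.strip() for k in collapsed.split(",") if k.strip()]


def split_on_newlines_and_commas(value: str) -> list[str]:
    # Alternative reading: both line breaks and commas separate keywords.
    return [k.strip() for k in re.split(r"[,\r\n]+", value) if k.strip()]


print(split_collapsing_whitespace(raw))
print(split_on_newlines_and_commas(raw))
```

The first function yields six keywords, with "keyword3 keyword4" kept as one; the second yields seven. The same input cannot satisfy both expectations, so the parser has to pick one.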