Opened 9 years ago
Last modified 6 years ago
#832 assigned enhancement
Handle title case/sentence case properly
| Reported by: | codec | Owned by: | simon |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | export | Version: | 1.5 |
| Keywords: | | Cc: | erazlogo |
Description (last modified by simon)
Some styles require title case ("Article Title Here"), while others require sentence case ("Article title here"). At the moment, a transform to title case is implemented (although it's not customizable). Unfortunately, sentence case is a much harder problem. Recovering titles like "Glycogen: a Trojan horse for neurons" and "Characterization of the SKN7 ortholog of Aspergillus fumigatus" from their title case equivalents is not possible by algorithm alone, because nothing in the title case form marks which words (here "Trojan", "SKN7", and "Aspergillus") must keep their capitals.
Suppose we provide an icon to toggle between title/sentence case in the edit pane. Then, if the user modifies something beyond the capitalization in either form, we determine the minimum number of deletes and inserts to transform it (classic dynamic programming problem), and if this includes insertion of a new word, we capitalize it/put it in lower case as appropriate. This will require some modifications to the database schema, but is, as far as I can tell, the most intuitive all-encompassing solution.
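A minimal Python sketch of the diff step described above, for illustration only (it is not Zotero code). It assumes both forms are stored with the same word count, aligns the old and new text of the edited form word by word using a longest-common-subsequence table (the dynamic programming part), and re-cases only the inserted words. The names `lcs_table` and `propagate_edit` are hypothetical.

```python
def lcs_table(a, b):
    """table[i][j] is the length of the longest common subsequence of a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def propagate_edit(old_edited, new_edited, old_other, to_title):
    """Carry a word-level edit made in one case form over to the other form.

    old_edited / new_edited: the form the user changed, before and after.
    old_other: the other stored form (assumed to have the same word count).
    to_title: True if the other form is title case, False if sentence case.
    """
    old_words = old_edited.split()
    new_words = new_edited.split()
    other_words = old_other.split()
    if len(old_words) != len(other_words):
        raise ValueError("both stored forms must have the same number of words")

    # Align case-insensitively so that a pure capitalization change is not a diff.
    a = [w.lower() for w in old_words]
    b = [w.lower() for w in new_words]
    table = lcs_table(a, b)

    result = []
    i = j = 0
    while j < len(b):
        if i < len(a) and a[i] == b[j]:
            result.append(other_words[i])  # unchanged word: keep its existing casing
            i += 1
            j += 1
        elif i < len(a) and table[i + 1][j] >= table[i][j + 1]:
            i += 1  # word deleted by the user: drop it from the other form too
        else:
            word = new_words[j]  # word inserted by the user: re-case it naively
            result.append(word[:1].upper() + word[1:] if to_title else word.lower())
            j += 1
    return " ".join(result)


title_form = "Glycogen: A Trojan Horse for Neurons"
sentence_form = "Glycogen: a Trojan horse for neurons"
edited_title = "Glycogen: A Trojan Horse for Hungry Neurons"
print(propagate_edit(title_form, edited_title, sentence_form, to_title=False))
```

Here the inserted word "Hungry" is lowered to "hungry" in the sentence case form while "Trojan" keeps its capital. An inserted proper noun or acronym would still be lowered, so the user would need to fix it by hand in the other form.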
Alternatively, we could leave the entry UI as is, and use BibTeX-style curly braces to specify that letters should be capitalized as they are. This is the easier solution, but is somewhat lacking in terms of usability.
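As a rough illustration of the brace approach, the Python sketch below lowercases everything outside `{...}` and keeps braced text exactly as typed. It is a simplified assumption of how such protection could work: nested braces are not handled, and the function name `to_sentence_case` is hypothetical.

```python
import re

# Matches one non-nested {...} group; nested braces are out of scope for this sketch.
BRACED = re.compile(r"\{([^{}]*)\}")


def to_sentence_case(title):
    """Lowercase text outside braces, keep braced text verbatim, and drop the braces."""
    pieces = []
    pos = 0
    for match in BRACED.finditer(title):
        pieces.append(title[pos:match.start()].lower())  # unprotected text
        pieces.append(match.group(1))                    # protected text, unchanged
        pos = match.end()
    pieces.append(title[pos:].lower())
    text = "".join(pieces)
    # Capitalize the first letter unless the title starts with a protected span.
    if not title.startswith("{"):
        text = text[:1].upper() + text[1:]
    return text


print(to_sentence_case("Glycogen: A {Trojan} Horse for Neurons"))
print(to_sentence_case("Characterization of the {SKN7} Ortholog of {Aspergillus} Fumigatus"))
```

The usability cost mentioned above shows up directly: the user has to know to type `{Trojan}` and `{Aspergillus}` when entering the title.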
Change History (14)
comment:1 Changed 9 years ago by simon
- Component changed from uncategorized to export
- Milestone set to 1.5 Alpha 1
- Owner changed from dstillman to simon
- Status changed from new to assigned
comment:2 Changed 9 years ago by codec
Yes, no problem. In practice I don't think it's a big deal, but I thought it worth recording.
comment:3 Changed 9 years ago by simon
- Description modified (diff)
- Priority changed from minor to major
- Summary changed from enforce-case formatting is supported. to Handle title case/sentence case properly
comment:4 Changed 9 years ago by simon
- Description modified (diff)
comment:5 Changed 9 years ago by simon
- Version changed from 1.0 to 1.5
comment:6 Changed 9 years ago by erazlogo
- Cc erazlogo added
From an IM w/ Sean: Another option to consider is to have a preference to enable/disable CSL casing for titles if it's in a CSL style, so users could just format titles via "transform text" in the info pane (and edit errors themselves) instead of relying on CSL. (Perhaps create the pref specifically for titles because casing for "type" and "location in archive" should be done in CSL)
comment:7 in reply to: ↑ description Changed 9 years ago by bdarcus
I'm not sure if this is a known bug, but I just copied and pasted a title-case-transform title to a document. The title in the document is all lowercase. Not good.
comment:8 Changed 9 years ago by dstillman
Bruce, what do you mean? Copied from where?
comment:9 Changed 8 years ago by dstillman
http://daringfireball.net/2008/05/title_case lists some good heuristics for doing proper case conversions, such as skipping over the word if it already contains capitalized letters other than the first character (to deal with iPhone, etc.) and skipping domain names.
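A small Python sketch of the two heuristics named above (leave words with interior capitals alone, leave domain names alone). The small-word list and the domain pattern are assumptions made for this example and are not taken from the linked article; `to_title_case` is a hypothetical name.

```python
import re

# Assumed list of words that stay lowercase in the middle of a title.
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for",
               "in", "of", "on", "or", "the", "to", "via", "vs"}

# Assumed, deliberately loose pattern: labels separated by dots, ending in a 2+ letter TLD.
DOMAIN = re.compile(r"[\w-]+(\.[\w-]+)*\.[A-Za-z]{2,}")


def has_inner_capital(word):
    """True for words like 'iPhone' that carry a capital after the first character."""
    return any(ch.isupper() for ch in word[1:])


def looks_like_domain(word):
    return DOMAIN.fullmatch(word) is not None


def to_title_case(text):
    words = text.split()
    out = []
    for index, word in enumerate(words):
        if has_inner_capital(word) or looks_like_domain(word):
            out.append(word)  # skip: already deliberately cased, or a domain name
        elif word.lower() in SMALL_WORDS and 0 < index < len(words) - 1:
            out.append(word.lower())  # small word that is neither first nor last
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


print(to_title_case("the iPhone review on daringfireball.net is out"))
```

A domain followed directly by punctuation (for example a trailing comma) would not match the pattern as written and would be capitalized like an ordinary word.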
comment:10 Changed 8 years ago by simon
Is there any resolution on this? For 1.5, do we want to go with the more elegant route (in the description) that will require some UI/DB changes, or rely on dumb algorithms that the user can easily turn off?
comment:11 Changed 8 years ago by tjowens
Many of the data sources we work with output author names in all caps. It would probably make sense to offer the same text transformation options for names that we offer for titles. See the request here: http://forums.zotero.org/discussion/1017/?Focus=15978#Comment_15978
comment:12 Changed 8 years ago by dstillman
Pretty good JS algorithm in response to Gruber's title case request (above): http://individed.com/code/to-title-case/
English-only, and wouldn't obviate the need for manual tweaking, but a good place to start for converting to title case.
Can you outline the DB changes that would be required for the elegant solution? Also, what about users who turn off the capitalizeTitles pref? Would we need to auto-detect the original format used in those cases (or specify it in the translator)?
comment:13 Changed 8 years ago by simon
That script looks useful.
I'd say that auto-detecting the original format is a necessity. Any algorithm will have less success converting title->sentence case than with sentence->title case because of issues with proper nouns ("The Zotero Quick Start Guide" would probably become "The zotero quick start guide", although we might be able to make use of the spell checker to avoid some of this), so we want to avoid discarding any data.
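One possible shape for the auto-detection, as a hedged Python sketch: classify a title by the share of capitalized words after the first, ignoring small words. The 50% threshold, the small-word list, and the name `detect_case` are all assumptions for illustration; short titles and titles dense with proper nouns will be misclassified.

```python
# Assumed list of words that carry no signal because they are lowercase in both forms.
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for",
               "in", "of", "on", "or", "the", "to", "via", "vs"}


def detect_case(title):
    """Return 'upper', 'title', 'sentence', or 'unknown' for a title string."""
    letters = [ch for ch in title if ch.isalpha()]
    if letters and all(ch.isupper() for ch in letters):
        return "upper"  # ALL CAPS input

    # The first word is capitalized in both forms, so only later words are informative.
    candidates = [w for w in title.split()[1:]
                  if w[:1].isalpha() and w.lower() not in SMALL_WORDS]
    if not candidates:
        return "unknown"  # e.g. a one-word title

    capitalized = sum(1 for w in candidates if w[0].isupper())
    return "title" if capitalized / len(candidates) > 0.5 else "sentence"


print(detect_case("The Zotero Quick Start Guide"))
print(detect_case("Glycogen: a Trojan horse for neurons"))
```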
The simplest way to store this in the DB would involve separate title case and sentence case strings for a few fields (title, series title, book title, short title). If we auto-detect the original case, then we can deal with the pref by saving a bit indicating which case is verbatim. When the pref is off, we use the title that's verbatim. (This doesn't work for ALL CAPS titles, which we'd want to convert before saving, but I don't know whether converting ALL CAPS titles regardless of the pref is really a bad idea.)
I think we need a second pref for which title to display by default in the Zotero pane, with title case, sentence case, and verbatim options. Turning the capitalizeTitles pref off would necessitate the verbatim option. Users could switch between the title case and sentence case versions of a title in the info pane by clicking a button next to the field.
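To make the storage and pref interaction concrete, here is a minimal Python sketch of a record holding both strings plus the verbatim bit, and of how the two prefs could pick what to show. The class, field, and function names are hypothetical and do not reflect the actual Zotero schema.

```python
from dataclasses import dataclass


@dataclass
class StoredTitle:
    title_case: str
    sentence_case: str
    verbatim_is_title: bool  # the bit recording which form was entered as-is

    def verbatim(self):
        return self.title_case if self.verbatim_is_title else self.sentence_case


def display_title(stored, capitalize_titles, display_pref):
    """Pick the string to show.

    capitalize_titles: stands in for the existing capitalizeTitles pref.
    display_pref: the proposed second pref: 'title', 'sentence', or 'verbatim'.
    """
    # With the first pref off, only the verbatim form is trusted.
    if not capitalize_titles or display_pref == "verbatim":
        return stored.verbatim()
    if display_pref == "title":
        return stored.title_case
    if display_pref == "sentence":
        return stored.sentence_case
    raise ValueError(f"unknown display pref: {display_pref!r}")


record = StoredTitle(
    title_case="Glycogen: A Trojan Horse for Neurons",
    sentence_case="Glycogen: a Trojan horse for neurons",
    verbatim_is_title=False,
)
print(display_title(record, capitalize_titles=True, display_pref="title"))
print(display_title(record, capitalize_titles=False, display_pref="title"))
```

The second call returns the sentence case string even though title case was requested, because the first pref is off and the sentence case form is the verbatim one.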
comment:14 Changed 6 years ago by dstillman
- Milestone 2.0 Beta 3 deleted
Right. This is not implemented because it's very difficult to implement properly, and we haven't had to get to it quite yet.