Opened 6 years ago
Closed 6 years ago
#1770 closed defect (fixed)
Refresh triggers "this._session.style is undefined" or "citations is undefined" error in citeproc-js
| Reported by: | fbennett | Owned by: | simon |
|---|---|---|---|
| Priority: | critical | Milestone: | |
| Component: | word integration | Version: | 2.1 |
| Keywords: | Cc: |
Description (last modified by fbennett)
This was reported in the following thread:
http://forums.zotero.org/discussion/10916/word-plugin-error-thissessionstyle-is-undefined/
During testing of the multilingual branch, I came up with a simple document that would reliably reproduce this error when reconnected to Zotero. There turned out to be three layers to the problem. First, I found a loop constraint variable mismatch in citeproc-js (now fixed, in the version shipped with 2.1beta5). Second, the for loop in the Zotero-side restoreProcessorState function was attempting to iterate over an integer rather than a list. Fixing these two problems caused the processor crash to occur immediately, rather than on a second refresh attempt. The third issue was in the updateIndices and newIndices objects on Zotero-side. Citation pointers were being set in both objects during a refresh, leading updateCitations() to attempt to perform a second refresh cycle, which triggered the crash.
With the attached patch, document refreshes appear to be solid. This should probably be tested and pushed to clients soon, since early adopters of 2.1beta are likely to be frustrated by it.
I have the document here, and a small Zotero database that drives it, if needed for testing. I've confirmed that it triggers the crash in Zotero 2.1b5.
Attachments (3)
Change History (21)
Changed 6 years ago by fbennett
comment:1 Changed 6 years ago by fbennett
- Description modified (diff)
comment:2 Changed 6 years ago by simon
The restoreProcessorState call is disabled for now (although we might be able to enable it thanks to citeproc-js 1.0.101) so AFAIK the changes there shouldn't make a difference. I need to go over this code again to vet the remainder of the patch.
comment:3 Changed 6 years ago by simon
I'm still a little confused about this one. We shouldn't be calling processCitationCluster twice on the same citations, since formatCitation should cache them in this.citationText, and even if we are, I'm unclear why this should lead to a processor crash. Can you email me the test doc?
comment:4 Changed 6 years ago by simon
Or better yet, attach it here.
Changed 6 years ago by fbennett
comment:5 Changed 6 years ago by fbennett
The document and a database to drive it are up, as RestoreProcessorState-kit.zip. I'll confess that I didn't quite get to the point of fully understanding the problem, but overlapping itemIDs in updateItems and newItems were the trigger. The cause, on the processor side, was that one of the three references in the document went missing on the second refresh, so that a call to its itemID crashed the processor.
My debugging skills are pretty primitive, and I quickly get exhausted with the complexities of the two systems on this one, but I agree that it would be nice the work out more completely what the cause is.
comment:6 Changed 6 years ago by fbennett
I haven't check this idea against the code yet, but I think I may have figured this one out. One of the things I noticed in integration.js is a variable citationsByIndex. citeproc-js has a similarly named variable citationByIndex, which is managed by registry operations. I think what's happening is that integration.js is taking the return value from processCitationCluster() was the list of citations, and via citationsByIndex (the integration.js variable), that's being fed back into restoreProcessorState() on update. But the return value from processCitationCluster() isn't the full list of citations, but a list of citations that require an update.
The order of citations in the sample doc Test.odt was changed with a cut-and-paste rearrangement immediately before saving. and the sequence numbers on the citations are incorrect. These are corrected by citeproc-js, and it returns only the two citations that need to be updated in the doc. Not quite sure what the actual processing sequence is, but that mismatch would trigger the error that we're getting.
comment:7 Changed 6 years ago by fbennett
In that last post, "but a list of citations that require an update" should have read "but a list of citations that require an update in the document". (The citationByIndex variable is managed by the processor, and should not be written into directly.)
comment:8 Changed 6 years ago by fbennett
Okay, progress. restoreProcessorState() runs processCitationCluster() after installing citation data, in order to reconcile citation numbers. I had missed that the return value is significant, since the updated note numbers etc need to be written back into the document. I'm giving restoreProcessorState() a return value now. It should start making sense now.
comment:9 Changed 6 years ago by fbennett
I may not be understanding things completely, but it seems to me that the code in integration.js may be replicating a number of things that the processor handles internally. When reconciling to the document on refresh, all you should need to do is do _getFields() and feed the available citation data to restoreProcessorState(). Once the processor is loaded with data, it should be possible to draw session data directly from its internal objects and arrays. I've added a little clarification to the processor manual:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#runtime-state-internal-objects-and-arrays
comment:10 Changed 6 years ago by simon
That's quite possible. Will it properly identify citations that have moved between restoreProcessorState() calls? How about citations that have been duplicated by copy/paste and thus have the same citation IDs?
comment:11 Changed 6 years ago by fbennett
A bit of poking around in response to this forum post led me to add a convenience function to the registry, that might be useful for the "Edit Bibliography" function:
http://forums.zotero.org/discussion/16219/authordate-without-author-or-date
The new method is available in version 1.0.103 of the processor, and I've added a description to the processor manual:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#runtime-state-internal-objects-and-arrays
comment:12 Changed 6 years ago by fbennett
Duplicate citation IDs, nice one. I'll build a test, and get back. We should be able to spin a new citation ID when a duplicate is encountered, but I don't think that happens at the moment.
When restoreProcessorState() runs processCitationCluster() on the first citation in the document, it should return rendered citations and index numbers for any citations that are affected by reshuffling.
comment:13 Changed 6 years ago by fbennett
A closer look showed that restoreProcessorState() was lacking, so I've made it sensitive to the former arrangement of citations, in release 1.0.104. For generating citation IDs using your native function, you can overload the function citeproc.setCitationId(), found in util_integration.js. It takes a citation object and a flag as arguments. If the object has a citationID and the flag has no value, the function should not assign an ID. If the flag is true, assignment of a fresh ID should be forced.
The new logic in restoreProcessorState() hasn't been thoroughly tested yet, but it should work, but what it does is pretty simple, and it should be easy to iron out any problems.
comment:14 Changed 6 years ago by simon
I think that this should reproduce the error without Zotero, although I can't quite tell, since Rhino doesn't appear to report the line number and the error message is slightly different. The test fails as posted, but seems to pass if I s/"id":([0-9])/"id":"item$1"/g.
comment:15 Changed 6 years ago by simon
I believe I've tracked down the problem. When you iterate over an object, you will always get strings. This isn't always a problem because
{1:true}1? == true
However:
[0, 1, 2].indexOf("2") == -1;
This happens at
posB = this.namereg[pkey].items.indexOf(id);
on line 8811 of the single file citeproc-js, in other places in that function, and perhaps other places in the codebase as well. Everywhere you push items onto an array, you need to call toString(). Adding item_id = item_id.toString() to the top of evalname eliminates the "citations is undefined" error, although it might be worth checking to make sure there are no other cases of this.
comment:16 Changed 6 years ago by fbennett
Lots of gratitude here. I came across the problem of type mismatches a year or so ago, and I thought that these cases had all been fixed. Obviously I'll need to take another look. Will think about how to proceed to be sure this is fully squashed.
comment:17 Changed 6 years ago by fbennett
I've put up a new processor version (1.0.105), that incorporates this fix, and forces IDs to string where relevant. Very happy to confirm that this bug disappears when the new processor version is installed in an instance of Zotero 2.1b5.
I thought I was beginning to understand the logic in integration.js, but not so much, apparently. I'd better keep the day job. :)
comment:18 Changed 6 years ago by simon
- Resolution set to fixed
- Status changed from new to closed
Is