Differences

This shows you the differences between two versions of the page.

dev:web_api:v3:syncing [2019/03/10 04:22] – [Full-Library Syncing] dstillman
dev:web_api:v3:syncing [2019/05/25 06:54] – [Collection/Tag Deletions and Syncing] dstillman
Line 14: Line 14:
   * A version number for each library   * A version number for each library
   * A version number and a boolean ''synced'' flag for each syncable object   * A version number and a boolean ''synced'' flag for each syncable object
 +  * A list of downloaded objects that could not be processed and should be requested explicitly regardless of their remote version number (optional; see [[#Handling save errors|Handling save errors]] for details)
  
 ===== Version Numbers ===== ===== Version Numbers =====
Line 27: Line 28:
 === Last-Modified-Version === === Last-Modified-Version ===
  
-The ''Last-Modified-Version'' response header indicates the current version of either a library (for multi-object requests) or an individual object (for single-object requests). If changes are made to a library in a write request, the library's version number will be increased, any objects modified in the same request will be set to the new version number, and the new version number will be returned in the ''Last-Modified-Version'' header. Since modified objects always receive the newly increased library version, the returned ''Last-Modified-Version'' will be the same whether an item is modified as part of a multi-object or single-object request.+The ''Last-Modified-Version'' response header indicates the current version of either a library (for multi-object requests) or an individual object (for single-object requests). If changes are made to a library in a write request, the library's version number will be increased, any objects modified in the same request will be set to the new version number, and the new version number will be returned in the ''Last-Modified-Version'' header.
  
 === If-Modified-Since-Version === === If-Modified-Since-Version ===
Line 68: Line 69:
  
 Like ''format=keys'', ''format=versions'' is not limited by a maximum number of results and returns all matching objects by default. Like ''format=keys'', ''format=versions'' is not limited by a maximum number of results and returns all matching objects by default.
 +
 +=== Local object versions ===
 +
 +The use of local object versions during syncing, and the process for updating them, is described below.
 +
 +When objects are created or modified locally by the user during regular usage, set ''synced = false'' to indicate that the object needs to be uploaded on the next sync. Give new objects version 0. Do not change the version when objects are modified outside of the sync process.
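A minimal sketch of the local bookkeeping described above, in Python. The ''LocalObject'' class and the helper names are hypothetical and only illustrate the rules stated here: new objects start at version 0, local edits clear ''synced'', and the version is never changed outside the sync process.

```python
from dataclasses import dataclass, field


@dataclass
class LocalObject:
    """Hypothetical local record for one syncable object."""
    key: str
    version: int = 0          # new objects start at version 0
    synced: bool = False      # False means "upload on the next sync"
    data: dict = field(default_factory=dict)


def modify_locally(obj, **changes):
    """Apply a user edit made during regular usage (outside the sync process)."""
    obj.data.update(changes)
    obj.synced = False        # flag for upload; the version is deliberately left alone


def objects_to_upload(objects):
    """Return the objects that still need to be uploaded."""
    return [o for o in objects if not o.synced]
```

Keeping the version untouched on local edits means it can still be compared against the remote version the next time the object is synced.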
  
 ===== Full-Library Syncing ===== ===== Full-Library Syncing =====
Line 100: Line 107:
 }</code> }</code>
  
-''/keys/current'' returns information on the API key provided in the ''Zotero-API-Key'' header. Use this response to verify that the key has the expected access to the library you're trying to access.+''/keys/current'' returns information on the API key provided in the ''Zotero-API-Key'' header. Use this response to verify that the key has the expected access to the library you're trying to access. If necessary, show a warning that the user no longer has sufficient access and offer to remove a local library or reset local changes.
  
 ==== 2) Get updated group metadata ==== ==== 2) Get updated group metadata ====
Line 118: Line 125:
 }</code> }</code>
  
-Delete any local groups not in the list. If data has been modified locally in any remotely deleted groups, offer the user the ability to cancel and transfer modified data elsewhere before continuing.+Delete any local groups not in the list, which either were deleted or are currently inaccessible. (The user may have been removed from a group, or the current API key may no longer have access.) If data has been modified locally in any groups that are no longer available, offer the user the ability to cancel and transfer modified data elsewhere before continuing.
  
 For each group that doesn't exist locally or that has a different version number, retrieve the group metadata: For each group that doesn't exist locally or that has a different version number, retrieve the group metadata:
Line 139: Line 146:
 Retrieve the versions of all objects changed since the last check for that object type, using the appropriate request for each object type: Retrieve the versions of all objects changed since the last check for that object type, using the appropriate request for each object type:
  
-  GET <userOrGroupPrefix>/collections?since=<last saved library version>&format=versions +  GET <userOrGroupPrefix>/collections?since=<version>&format=versions 
-  GET <userOrGroupPrefix>/searches?since=<last saved library version>&format=versions +  GET <userOrGroupPrefix>/searches?since=<version>&format=versions 
-  GET <userOrGroupPrefix>/items/top?since=<last saved library version>&format=versions&includeTrashed=1 +  GET <userOrGroupPrefix>/items/top?since=<version>&format=versions&includeTrashed=1 
-  GET <userOrGroupPrefix>/items?since=<last saved library version>&format=versions&includeTrashed=1+  GET <userOrGroupPrefix>/items?since=<version>&format=versions&includeTrashed=1
  
-''<last saved library version>'' is the ''[[#last-modified-version|Last-Modified-Version]]'' returned from the API for the last successfully completed sync process, or ''0'' when syncing a library for the first time.+''<version>'' is the final ''[[#last-modified-version|Last-Modified-Version]]'' returned from the API for the last successfully completed sync process, or ''0'' when syncing a library for the first time.
  
 (The ''since'' parameter can also be used on ''.../tags'' requests (without ''format=versions'') by clients that don't download all items and wish to keep a list of all tags in a library up-to-date. It isn't necessary for clients that download all items to request updated tags directly, as item objects contain all associated tags.) (The ''since'' parameter can also be used on ''.../tags'' requests (without ''format=versions'') by clients that don't download all items and wish to keep a list of all tags in a library up-to-date. It isn't necessary for clients that download all items to request updated tags directly, as item objects contain all associated tags.)
  
-The first request (e.g., for collection versions) can also include an ''If-Modified-Since-Version: <last saved library version>'' header. If the API returns ''304 Not Modified'', no library data of any object type has changed since the version specified and no further requests need to be made to retrieve data.+The first request (e.g., for collection versions) can also include an ''If-Modified-Since-Version: <last saved library version>'' header. If the API returns ''304 Not Modified'', no library data of any object type has changed since the version specified and no further requests need to be made to retrieve data unless there are [[#Handling save errors|previously failed objects]] that should be retried.
  
 ''200'' response: ''200'' response:
Line 160: Line 167:
 ]</code> ]</code>
  
-For each returned object, compare the version to the local version of the object. If the remote version doesn't match, queue the object for download. Generally all returned objects should have newer version numbers, but there are some situations, such as full syncs (i.e., since=0) or interrupted syncs, where clients may retrieve versions for objects that are already up-to-date locally. The version will also match for top-level items on the second, non-'/top' ''items'' request, since top-level items will have already been processed.+For each returned object, compare the version to the local version of the object. If the remote version doesn't match, queue the object for download. Generally all returned objects should have newer version numbers, but there are some situations, such as full syncs (i.e., ''since=0'') or interrupted syncs, where clients may retrieve versions for objects that are already up-to-date locally. The version will also match for top-level items on the second, non-''/top'' ''items'' request, since top-level items will have already been processed.
  
-Retrieve the queued objects by key, up to 50 at a time, using the appropriate request for each object type:+Retrieve the queued objects, as well as any [[#Handling save errors|flagged]] as having previously failed to save, by key, up to 50 at a time, using the appropriate request for each object type:
  
   GET <userOrGroupPrefix>/collections?collectionKey=<key>,<key>,<key>,<key>   GET <userOrGroupPrefix>/collections?collectionKey=<key>,<key>,<key>,<key>
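The comparison and batching steps above might look roughly like this in Python. This is a sketch under assumptions: the ''format=versions'' response is taken to be already parsed into a key-to-version mapping, and all function names are made up for illustration. Only the batch size of 50 and the ''collectionKey'' parameter come from this page.

```python
def keys_to_download(remote_versions, local_versions, previously_failed=()):
    """Queue keys whose remote version differs from the local one,
    plus any keys flagged as having failed to save in an earlier sync."""
    queue = [key for key, version in remote_versions.items()
             if local_versions.get(key) != version]
    for key in previously_failed:
        if key not in queue:
            queue.append(key)
    return queue


def batches(keys, size=50):
    """Yield successive slices of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def collection_request_path(prefix, keys):
    """Build the by-key request path for one batch of collection keys."""
    return f"{prefix}/collections?collectionKey={','.join(keys)}"
```

A key missing from ''local_versions'' never matches, so objects that don't exist locally are queued as well.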
Line 197: Line 204:
              
  
-If an error occurs while processing an object (e.g., due to a foreign-key constraint in a database), it can be handled one of two ways:+== Conflict resolution ==
  
-  - Treat the error as fatal and stop the sync without updating the local library version +Conflict resolution is a complex process not fully described here, but see the Zotero client code for examples.
-  - Mark the object as needing to be downloaded later and continue with the sync, updating the local library version at the end as if the sync had succeeded. In a future sync, add objects with this flag to the set of objects returned from the ''versions'' request so that their data is requested again even if the remote version is lower than the library version specified in ''?since=''.+
  
-When processing a set of objects, it may be helpful to maintain an object queue and add failing objects to the end of the queue in case they depend on other objects to succeed. (In some cases, it's also possible to sort objects beforehand to avoid such errors, such as by sorting parent collections before subcollections.)+A few notable features:
  
-When objects are created or modified locally by the user during regular usage, set ''synced = false'' to indicate that the object needs to be uploaded on the next sync. Give new objects version 0. Do not change the version when objects are modified outside of the sync process.+  - When an object is successfully downloaded or uploaded, the Zotero client saves the ''data'' block from the API response as pristine JSON tied to the object version. When a conflict occurs during a sync, it can then compare both the local and remote versions of the object to the pristine JSON to determine which changes were made on each side and automatically merge changes that aren't in conflict. Users are prompted to manually resolve only conflicting changes to the same field. 
 +  - The Zotero client automatically resolves conflicts for objects other than items without prompting the user, erring on the side of restoring deleted data. 
 +  - Restoring locally deleted collections is a special case. Item membership is a property of items, so no local items will still be a member of the collection after it's restored, and the local items also may have been deleted along with the collection. When restoring a locally deleted collection, the Zotero client fetches the collection's items from the API and either adds them back to the collection and marks them as unsynced (if they still exist locally) or removes them from the local delete log and flags them for manual download (if they don't). 
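The pristine-JSON comparison in the first point above is a three-way merge. The following Python sketch shows the general idea only. It is not the Zotero client's implementation: it treats each object as a flat dictionary of fields and uses ''None'' to mean "field absent", both of which are simplifying assumptions.

```python
def three_way_merge(pristine, local, remote):
    """Merge local and remote field changes relative to a pristine copy.

    Returns (merged, conflicts): `merged` holds every field that could be
    resolved automatically, and `conflicts` lists fields changed
    differently on both sides, which need manual resolution.
    """
    merged, conflicts = {}, []
    for name in sorted(set(pristine) | set(local) | set(remote)):
        base = pristine.get(name)
        mine = local.get(name)
        theirs = remote.get(name)
        if mine == theirs:
            value = mine          # both sides agree
        elif mine == base:
            value = theirs        # only the remote side changed
        elif theirs == base:
            value = mine          # only the local side changed
        else:
            conflicts.append(name)
            continue
        if value is not None:     # None means the field was removed
            merged[name] = value
    return merged, conflicts
```

Only the fields listed in ''conflicts'' would need to be shown to the user.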
 + 
 +== Handling save errors == 
 + 
 +If an error occurs while processing an object (e.g., due to a foreign-key constraint in the local database), it can be handled one of two ways: 
 + 
 +  - Treat the error as fatal and stop the sync without updating the local library version 
 +  - Add the object key to a list of objects needing to be downloaded later and continue with the sync, updating the local library version at the end as if the sync had succeeded. In a future sync, add objects on this list to the set of objects returned from the ''versions'' request so that their data is requested again even if the remote version is lower than the library version specified in ''?since=''. Ideally, retry these objects on a backoff schedule, since they may require either a server-side fix or a client update to save successfully. If these objects later appear as deleted, remove them from the list of objects. 
 + 
 +When processing a set of objects, it may be helpful to maintain a process queue for the sync run and move failing objects to the end of the queue in case they depend on other objects being retrieved. (In many cases, it's possible to sort objects beforehand to avoid such errors, such as by sorting parent collections before subcollections.) If a loop of the process queue completes without any objects being successfully processed, stop the sync.
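One way to implement such a process queue, sketched in Python. ''SaveError'' and the ''save'' callback are hypothetical stand-ins for whatever the client uses to store an object. The loop stops once every object still in the queue has failed since the last success, which corresponds to a full pass without progress.

```python
from collections import deque


class SaveError(Exception):
    """Hypothetical error raised when an object cannot be saved yet."""


def process_queue(objects, save):
    """Save objects, moving failures to the end of the queue.

    Returns the objects that could not be saved (empty on full success).
    """
    queue = deque(objects)
    failures_since_success = 0
    while queue:
        obj = queue.popleft()
        try:
            save(obj)
        except SaveError:
            queue.append(obj)                 # retry after the others
            failures_since_success += 1
            if failures_since_success == len(queue):
                return list(queue)            # a full pass made no progress
        else:
            failures_since_success = 0
    return []
```

The objects returned here are candidates for the list of failed objects described above, or a reason to stop the sync, depending on which of the two strategies the client uses.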
  
 === ii. Get deleted data === === ii. Get deleted data ===
  
   GET <userOrGroupPrefix>/deleted?since=<version>   GET <userOrGroupPrefix>/deleted?since=<version>
 +
 +''<version>'' is, as above, the ''Last-Modified-Version'' returned from the API during the last successful sync run.
  
 Response: Response:
Line 234: Line 253:
 Process the remote deletions: Process the remote deletions:
  
-  for each deleted object in ['collections', 'searches', 'items', 'tags']:+  for each deleted object in ['collections', 'searches', 'items']:
     if local object doesn't exist:     if local object doesn't exist:
       continue       continue
Line 243: Line 262:
     else:     else:
       perform conflict resolution       perform conflict resolution
 +        if user chooses deletion, delete local object, skipping delete log
              
-      if user chooses deletion, delete local object, skipping delete log +        if user chooses local modification, keep object and set synced = true and version = Last-Modified-Version
-       +
-      if user chooses local modification, keep object and set synced = true and version = ''Last-Modified-Version'+
  
-Tags removed from all items are not necessarily deleted, hence the separate tag deletion mechanism.+The Zotero client automatically resolves conflicts for objects other than items without prompting the user, erring on the side of restoring deleted data.
  
 === iii. Check for concurrent remote updates === === iii. Check for concurrent remote updates ===
  
-For each response from the API, check the ''Last-Modified-Version'' to see if it has changed since the ''Last-Modified-Version'' returned from the first request (i.e., ''collections?since=''). If it has, restart the process of retrieving updated and deleted data, waiting increasing amounts of time between restarts to give the other client the opportunity to finish.+For each response from the API, check the ''Last-Modified-Version'' to see if it has changed since the ''Last-Modified-Version'' returned from the first request (e.g., ''collections?since=''). If it has, restart the process of retrieving updated and deleted data, waiting increasing amounts of time between restarts to give the other client the opportunity to finish.
  
-After saving all remote changes, save ''Last-Modified-Version'' from the last set of requests as the new local library version.+After saving all remote changes without the remote version changing during the process, save ''Last-Modified-Version'' from the last run as the new local library version.
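A rough Python sketch of the restart logic. It assumes a hypothetical ''run_once'' callable that performs one full round of retrieving updated and deleted data and returns the ''Last-Modified-Version'' values it saw, in request order, with at least one value. The doubling delay and the restart limit are illustrative choices, not values given by this page.

```python
import time


def download_with_restart(run_once, max_restarts=5, base_delay=1.0, sleep=time.sleep):
    """Repeat the download round until the library version stays constant.

    Returns the version to save as the new local library version.
    """
    for attempt in range(max_restarts + 1):
        versions = run_once()
        if all(v == versions[0] for v in versions):
            return versions[-1]               # no concurrent remote update
        if attempt < max_restarts:
            sleep(base_delay * 2 ** attempt)  # wait longer before each restart
    raise RuntimeError("library kept changing during download")
```

A real client would likely abort a round as soon as the first mismatched header arrives rather than finishing it; the sketch checks at the end only to stay short.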
  
 === iv. Upload modified data === === iv. Upload modified data ===
Line 269: Line 287:
 === v. Upload local deletions === === v. Upload local deletions ===
  
-See [[write_requests#deleting_multiple_collections|Deleting Multiple Collections]], [[write_requests#deleting_multiple_searches|Deleting Multiple Searches]], and [[write_requests#deleting_multiple_items|Deleting Multiple Items]]. Pass the current library version as ''If-Unmodified-Since-Version''.+When an object is deleted locally during regular usage, add its library and key to a delete log. When syncing, send delete requests for objects in the log and clear them from the log on successful deletion. When resolving a conflict between a locally deleted object and a remotely modified object in favor of the remote object, remove it from the delete log. 
 + 
 +See [[write_requests#deleting_multiple_collections|Deleting Multiple Collections]], [[write_requests#deleting_multiple_searches|Deleting Multiple Searches]], and [[write_requests#deleting_multiple_items|Deleting Multiple Items]] for the specific requests. Pass the current library version as ''If-Unmodified-Since-Version''.
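The delete log described above could be kept as simply as the following Python sketch. The ''DeleteLog'' class is hypothetical. Storing the object type alongside the library and key is an added assumption, made because delete requests are sent separately for each object type.

```python
class DeleteLog:
    """Hypothetical in-memory log of locally deleted objects."""

    def __init__(self):
        self._entries = set()   # (library_id, object_type, key)

    def record(self, library_id, object_type, key):
        """Call when an object is deleted locally during regular usage."""
        self._entries.add((library_id, object_type, key))

    def discard(self, library_id, object_type, key):
        """Call when a conflict is resolved in favor of the remote object."""
        self._entries.discard((library_id, object_type, key))

    def pending(self, library_id, object_type):
        """Return the keys to send in the next delete request."""
        return sorted(k for lib, typ, k in self._entries
                      if lib == library_id and typ == object_type)

    def clear(self, library_id, object_type, keys):
        """Call after the server confirms deletion of `keys`."""
        for key in keys:
            self._entries.discard((library_id, object_type, key))
```

A persistent client would store these entries in its local database so that deletions survive a restart before the next sync.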
  
 Example request: Example request:
Line 284: Line 304:
  
 On a ''412 Precondition Failed'' response, return to the beginning of the sync process for that library. On a ''412 Precondition Failed'' response, return to the beginning of the sync process for that library.
- 
  
 ===== Partial-Library Syncing ===== ===== Partial-Library Syncing =====
Line 309: Line 328:
  
 Note that multi-object endpoints should always be used for large operations. Using single-object endpoints excessively could result in throttling by the server. Note that multi-object endpoints should always be used for large operations. Using single-object endpoints excessively could result in throttling by the server.
- 
-===== Collection/Tag Deletions and Syncing ===== 
- 
-A collection or tag deletion will cause all associated items to be updated on the server, and the updated items will be set to the library version returned by the deletion request. This interaction between object types can result in sync conflicts if clients don't take special precautions when performing these actions. 
- 
-Clients have two options for performing collection and tag deletions: 
- 
-==== Re-upload Items and Delete Collection/Tag ==== 
- 
-This method is appropriate for clients that sync the entire library. 
- 
-When deleting a collection/tag locally, mark previously associated items as changed. Before sending the collection/tag DELETE request, upload the modified items to the server. Once those changes have been uploaded, the DELETE for the collection/tag can be sent. Since the collection/tag on the server will have no associated items, there is no potential for a conflict between local and remote items. 
- 
-==== Delete Collection/Tag and Redownload Items ==== 
- 
-This method is appropriate for clients that will not necessarily have all items associated with the collection/tag locally or that expect to have significantly more limited upload bandwidth. 
- 
-When deleting a collection/tag locally, the client should not mark previously associated items as changed to avoid triggering conflicts when the items updated on the server are redownloaded. 
- 
-However, a conflict can still occur if an associated item is modified locally in other ways and not synced to the server before the collection/tag deletion is uploaded. When the client tries to pull down the updated remote item after the collection/tag deletion, the local version will be marked as changed, and since the data won't match, the client will need to perform conflict resolution. 
- 
-To avoid this, clients can store a pristine copy of the item data (not counting collections and tags) before modifying an item locally. This will allow the client to determine what local and remote changes have been made since the item was last downloaded. 
- 
-Then, when a conflict occurs, if the server's item data matches the pristine copy and the server collections/tags match the current local collections/tags, clients can just upload the local item data changes. 
- 
-If the server item data doesn't match the pristine copy, the client can attempt to apply both local and remote changes and perform a conflict resolution only if the same field has been modified. 
- 
-If the server collections/tags don't match the current local collections/tags, the client will need to either perform conflict resolution or automatically merge the collections and tags, restoring any deleted ones. 
- 
- 
  
  
dev/web_api/v3/syncing.txt · Last modified: 2022/08/14 05:34 by dstillman