Opened 10 years ago
Closed 10 years ago
#87 closed enhancement (fixed)
Add fromArray() and toArray() methods to Item objects
| Reported by: | dstillman | Owned by: | dstillman |
|---|---|---|---|
| Priority: | major | Milestone: | 1.0 Beta 1 |
| Component: | data layer | Version: | 1.0 |
| Keywords: | | Cc: | simon |
Description
Convert items to and from multi-dimensional associative arrays -- for use by the import/export architecture
Blocking #78
Change History (10)
comment:1 follow-up: ↓ 2 Changed 10 years ago by dstillman
comment:2 in reply to: ↑ 1 Changed 10 years ago by simon
Dan, how do you intend to implement seeAlso? seems to me like it's probably easiest to use IDs, but we'd also need IDs for notes.
also, i will eventually need some kind of function that will return a project along with its subprojects and the IDs of the items in it. i could write it (it's not really any more complicated than #42), but you're the data layer guy, so you get first dibs.
comment:3 Changed 10 years ago by dstillman
comment:4 Changed 10 years ago by dstillman
(In [337]) Addresses #87, Add fromArray() and toArray() methods to Item objects
Changed _getDescendents to take _nested_ flag, be extensible later for smart folders and other types, and include the collection name in the dataset
Added Collection.toArray()
Sample array:
'0' ...
    'id' => "13"
    'type' => "item"
'1' ...
    'id' => "14"
    'type' => "item"
'2' ...
    'id' => "7373"
    'name' => "A Sub-project!"
    'type' => "collection"
    'children' ...
        '0' ...
            'id' => "15"
            'type' => "item"
        '1' ...
            'id' => "9233"
            'name' => "A Sub-sub-project!"
            'type' => "collection"
            'children' ...
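A nested dataset like the sample above can be walked recursively to pull out the item IDs. The following is only a minimal sketch, not the code from [337]: the function name `collectItemIDs` is hypothetical, the data is a plain JavaScript structure modeled on the sample dump, and the innermost 'children' (elided in the sample) is assumed to be empty here.

```javascript
// Hypothetical data modeled on the sample dump above. The innermost
// 'children' is elided in the sample, so it is left empty here (assumption).
const sample = [
  { id: "13", type: "item" },
  { id: "14", type: "item" },
  {
    id: "7373",
    name: "A Sub-project!",
    type: "collection",
    children: [
      { id: "15", type: "item" },
      { id: "9233", name: "A Sub-sub-project!", type: "collection", children: [] },
    ],
  },
];

// Recursively gather the IDs of all items, descending into collections.
// collectItemIDs is an illustrative name, not part of the data layer.
function collectItemIDs(nodes) {
  const ids = [];
  for (const node of nodes) {
    if (node.type === "item") {
      ids.push(node.id);
    } else if (node.type === "collection") {
      // A collection contributes the items found anywhere beneath it.
      ids.push(...collectItemIDs(node.children || []));
    }
  }
  return ids;
}

console.log(collectItemIDs(sample)); // item IDs 13, 14 and 15, in that order
```

Keeping the "type" key on every node is what lets a consumer tell items and collections apart without a separate lookup.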
comment:5 Changed 10 years ago by simon
Dan, shouldn't the tags be an array of strings, rather than tag IDs?
comment:6 Changed 10 years ago by dstillman
comment:7 Changed 10 years ago by simon
(In [364]) closes #78, figure out import/export architecture
closes #100, migrate ingester to Scholar.Translate
closes #88, migrate scrapers away from RDF
closes #9, pull out LC subject heading tags
references #87, add fromArray() and toArray() methods to item objects
API changes:
all translation (import/export/web) now goes through Scholar.Translate
all Scholar-specific functions in scrapers start with "Scholar." rather than the jumbled up piggy bank un-namespaced confusion
scrapers no longer specify items through RDF (the beginning of an item.fromArray()-like function exists in Scholar.Translate.prototype._itemDone())
scrapers can be any combination of import, export, and web (type is the sum of 1/2/4 respectively)
scrapers now contain functions (doImport, doExport, doWeb) rather than loose code
scrapers can call functions in other scrapers or just call the function to translate itself
export accesses items item-by-item, rather than accepting a huge array of items
MARC functions are now in the MARC import translator, and accessed by the web translators
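Since the scraper type is described above as the sum of 1/2/4 for import/export/web, it can be read as a bit mask. A small sketch of decoding such a value follows; the constant names and the `describeType` function are hypothetical and are not taken from Scholar.Translate.

```javascript
// Flag values as described above: import = 1, export = 2, web = 4.
// These names are illustrative assumptions, not Scholar.Translate identifiers.
const TYPE_IMPORT = 1;
const TYPE_EXPORT = 2;
const TYPE_WEB = 4;

// Report which capabilities a summed type value includes.
function describeType(type) {
  return {
    canImport: (type & TYPE_IMPORT) !== 0,
    canExport: (type & TYPE_EXPORT) !== 0,
    canWeb: (type & TYPE_WEB) !== 0,
  };
}

// A scraper that handles import and web has type 1 + 4 = 5.
console.log(describeType(TYPE_IMPORT + TYPE_WEB));
```

Because each capability is a distinct power of two, any combination sums to a unique value between 1 and 7.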
new features:
import now works
rudimentary RDF (unqualified dublin core only), RIS, and MARC import translators are implemented (although they are a little picky with respect to file extensions at the moment)
items appear as they are scraped
MARC import translator pulls out tags, although this seems to slow things down
no icon appears next to the URL when Scholar hasn't detected metadata, since this seemed somewhat confusing
apologies for the size of this diff. i figured if i was going to re-write the API, i might as well do it all at once and get everything working right.
comment:8 Changed 10 years ago by dstillman
- Status changed from new to assigned
Simon, do we actually need/want a fromArray() function? I was originally thinking that the import mechanism could just accept (from the translators) the sort of multi-dimensional associative array that toArray() outputs, with the idea that that would make writing translators easier for lay folk, but I'm not sure how much easier it is to build up a proper array than to just use the existing API... It might be easier to immediately grok the array format by just looking at a sample dump, though.
What do you think?
comment:9 Changed 10 years ago by simon
there's an almost-complete version of fromArray() already in Scholar.Translate.prototype._itemDone() in translate.js (it doesn't yet handle creator types, notes, or see also, but that shouldn't be too difficult to implement). it's what i've been using for translators since i rewrote the API. i decided the array structure was worth it, because there's something to be said for using the same simple array system for all parts of Scholar.Translate.
the main problem with my implementation of this feature, it seems, is speed. i'm just using the standard data layer functions, but it seems like it takes longer than it ought to, especially when items have a lot of tags. any idea how to speed it up? it's certainly not a showstopper, but if you've got extra time on your hands i'd be happy if you could take a look. feel free to take over (or rewrite, if necessary) my code.
comment:10 Changed 10 years ago by simon
- Resolution set to fixed
- Status changed from assigned to closed
closing this ticket. there's no work that needs to be done here, since translate.js implements all of fromArray()'s proposed functionality and more.
(In [316]) Addresses #87, Add fromArray() and toArray() methods to Item objects
Item.toArray() implemented -- builds up a multidimensional array of item data, converting all type ids to their textual equivalents -- currently has empty placeholder arrays for tags and seeAlso
Sample source output:
'itemID' => "2"
'itemType' => "book"
'title' => "Computer-Mediated Communication: Human-to-Human Communication Across the Internet"
'dateAdded' => "2006-03-12 05:25:50"
'dateModified' => "2006-03-12 05:25:50"
'publisher' => "Allyn & Bacon Publishers"
'year' => "2002"
'pages' => "347"
'ISBN' => "0-205-32145-3"
'creators' ...
'notes' ...
'tags' ...
'seeAlso' ...
Sample note output:
'itemID' => "17"
'itemType' => "note"
'dateAdded' => "2006-06-27 04:21:16"
'dateModified' => "2006-06-27 04:21:16"
'note' => "text"
'sourceItemID' => "2"
'tags' ...
'seeAlso' ...
sourceItemID won't exist if it's an independent note.
We'll use the same format in reverse for fromArray, so Simon, let me know if you need more data (preserving type ids, etc) or want anything in a different form.
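For illustration, building this kind of structure while converting type IDs to their textual equivalents might look roughly like the sketch below. It is not the Item.toArray() code from [316]: the lookup table, the numeric type IDs, the shape of the input row, and the function name `itemToArray` are all hypothetical.

```javascript
// Hypothetical type table; the real type IDs live in the data layer.
const ITEM_TYPE_NAMES = { 1: "note", 2: "book" };

// Build a plain nested structure from a hypothetical internal row,
// replacing the numeric type ID with its textual name.
function itemToArray(row) {
  const out = {
    itemID: String(row.itemID),
    itemType: ITEM_TYPE_NAMES[row.itemTypeID],
    dateAdded: row.dateAdded,
    dateModified: row.dateModified,
  };
  if (out.itemType === "note") {
    out.note = row.note;
    // sourceItemID is omitted for an independent note.
    if (row.sourceItemID != null) {
      out.sourceItemID = String(row.sourceItemID);
    }
  } else {
    out.title = row.title;
    out.creators = [];
    out.notes = [];
  }
  // Empty placeholders, as in the sample output above.
  out.tags = [];
  out.seeAlso = [];
  return out;
}

const attachedNote = itemToArray({
  itemID: 17,
  itemTypeID: 1,
  dateAdded: "2006-06-27 04:21:16",
  dateModified: "2006-06-27 04:21:16",
  note: "text",
  sourceItemID: 2,
});
console.log(attachedNote.itemType, attachedNote.sourceItemID);
```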