Opened 10 years ago

Closed 10 years ago

#87 closed enhancement (fixed)

Add fromArray() and toArray() methods to Item objects

Reported by: dstillman
Owned by: dstillman
Priority: major
Milestone: 1.0 Beta 1
Component: data layer
Version: 1.0
Keywords:
Cc: simon

Description

Convert item to and from multi-dimensional associative arrays -- for use by import/export architecture

Blocking #78

Change History (10)

comment:1 follow-up: Changed 10 years ago by dstillman

(In [316]) Addresses #87, Add fromArray() and toArray() methods to Item objects

Item.toArray() implemented -- builds up a multidimensional array of item data, converting all type ids to their textual equivalents -- currently has empty placeholder arrays for tags and seeAlso

Sample source output:

'itemID' => "2"
'itemType' => "book"
'title' => "Computer-Mediated Communication: Human-to-Human Communication Across the Internet"
'dateAdded' => "2006-03-12 05:25:50"
'dateModified' => "2006-03-12 05:25:50"
'publisher' => "Allyn & Bacon Publishers"
'year' => "2002"
'pages' => "347"
'ISBN' => "0-205-32145-3"
'creators' ...
    '0' ...
        'firstName' => "Susan B."
        'lastName' => "Barnes"
        'creatorType' => "author"
'notes' ...
    '0' ...
        'note' => "text"
        'tags' ...
        'seeAlso' ...
    '1' ...
        'note' => "text"
        'tags' ...
        'seeAlso' ...
'tags' ...
'seeAlso' ...

Sample note output:

'itemID' => "17"
'itemType' => "note"
'dateAdded' => "2006-06-27 04:21:16"
'dateModified' => "2006-06-27 04:21:16"
'note' => "text"
'sourceItemID' => "2"
'tags' ...
'seeAlso' ...

sourceItemID won't exist if it's an independent note.

We'll use the same format in reverse for fromArray, so Simon, let me know if you need more data (preserving type ids, etc) or want anything in a different form.
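To make the conversion concrete, here is a minimal sketch of the id-to-text step described above. It is not the code from [316]: the lookup tables, the numeric ids in them, and the shape of the input object (itemTypeID, fields, creatorTypeID) are all assumptions made for illustration.

```javascript
// Hypothetical lookup tables. The numeric ids are invented for this sketch;
// the real data layer resolves type names from its own schema.
const ITEM_TYPES = { 1: "note", 2: "book" };
const CREATOR_TYPES = { 1: "author", 2: "editor" };

// Hypothetical stand-in for Item.toArray(): copies the item's fields as
// strings and replaces numeric type ids with their textual equivalents.
function itemToArray(item) {
  const arr = {
    itemID: String(item.itemID),
    itemType: ITEM_TYPES[item.itemTypeID],
  };
  for (const [field, value] of Object.entries(item.fields || {})) {
    arr[field] = String(value);
  }
  // sourceItemID is only present for notes attached to a source item.
  if (item.sourceItemID !== undefined) {
    arr.sourceItemID = String(item.sourceItemID);
  }
  if (item.creators) {
    arr.creators = item.creators.map((c) => ({
      firstName: c.firstName,
      lastName: c.lastName,
      creatorType: CREATOR_TYPES[c.creatorTypeID],
    }));
  }
  // Empty placeholders, as in the sample output above.
  arr.tags = [];
  arr.seeAlso = [];
  return arr;
}
```

Calling itemToArray() on an object with itemTypeID 2 and one creator with creatorTypeID 1 would give 'itemType' => "book" and 'creatorType' => "author" under these made-up tables.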

comment:2 in reply to: ↑ 1 Changed 10 years ago by simon

Dan, how do you intend to implement seeAlso? seems to me like it's probably easiest to use IDs, but we'd also need IDs for notes.

also, i will eventually need some kind of function that will return a project along with its subprojects and the IDs of the items in it. i could write it (it's not really any more complicated than #42), but you're the data layer guy, so you get first dibs.

comment:3 Changed 10 years ago by dstillman

(In [336]) Addresses #87, Add fromArray() and toArray() methods to Item objects

toArray() improvements:

  • seeAlso support (array of itemIDs)
  • Added itemID to source notes
  • Fixed bug in creator handling

comment:4 Changed 10 years ago by dstillman

(In [337]) Addresses #87, Add fromArray() and toArray() methods to Item objects

Changed _getDescendents to take a _nested_ flag, to be extensible later for smart folders and other types, and to include the collection name in the dataset

Added Collection.toArray()

Sample array:

'0' ...
    'id' => "13"
    'type' => "item"
'1' ...
    'id' => "14"
    'type' => "item"
'2' ...
    'id' => "7373"
    'name' => "A Sub-project!"
    'type' => "collection"
    'children' ...
        '0' ...
            'id' => "15"
            'type' => "item"
        '1' ...
            'id' => "9233"
            'name' => "A Sub-sub-project!"
            'type' => "collection"
            'children' ...
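
A consumer that only wants the item IDs in a collection and its sub-collections could walk this structure recursively. The following is a rough sketch, not part of [337]: collectItemIDs is a hypothetical helper, and the children of "A Sub-sub-project!" are elided in the sample above, so an empty array stands in for them here.

```javascript
// Hypothetical helper: gathers the ids of all entries with type "item"
// from a nested array shaped like the Collection.toArray() sample.
function collectItemIDs(entries) {
  const ids = [];
  for (const entry of entries) {
    if (entry.type === "item") {
      ids.push(entry.id);
    } else if (entry.type === "collection") {
      // Recurse into sub-collections; tolerate a missing children key.
      ids.push(...collectItemIDs(entry.children || []));
    }
  }
  return ids;
}

// The sample array from above, with the elided innermost children
// replaced by an empty array (an assumption).
const sampleCollection = [
  { id: "13", type: "item" },
  { id: "14", type: "item" },
  {
    id: "7373",
    name: "A Sub-project!",
    type: "collection",
    children: [
      { id: "15", type: "item" },
      { id: "9233", name: "A Sub-sub-project!", type: "collection", children: [] },
    ],
  },
];

console.log(collectItemIDs(sampleCollection)); // [ '13', '14', '15' ]
```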

comment:5 Changed 10 years ago by simon

Dan, shouldn't the tags be an array of strings, rather than tag IDs?

comment:6 Changed 10 years ago by dstillman

(In [359]) Addresses #87, Add fromArray() and toArray() methods to Item objects

Item.getTags() (which toArray() uses) now returns actual tags rather than ids -- separate method getTagIDs to return ids

comment:7 Changed 10 years ago by simon

(In [364]) closes #78, figure out import/export architecture
closes #100, migrate ingester to Scholar.Translate
closes #88, migrate scrapers away from RDF
closes #9, pull out LC subject heading tags
references #87, add fromArray() and toArray() methods to item objects

API changes:
all translation (import/export/web) now goes through Scholar.Translate
all Scholar-specific functions in scrapers start with "Scholar." rather than the jumbled up piggy bank un-namespaced confusion
scrapers no longer specify items through RDF (the beginning of an item.fromArray()-like function exists in Scholar.Translate.prototype._itemDone())
scrapers can be any combination of import, export, and web (type is the sum of 1/2/4 respectively)
scrapers now contain functions (doImport, doExport, doWeb) rather than loose code
scrapers can call functions in other scrapers or just call the function to translate itself
export accesses items item-by-item, rather than accepting a huge array of items
MARC functions are now in the MARC import translator, and accessed by the web translators
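
As an illustration of the "sum of 1/2/4" type value in the list above, a scraper's type can be decoded with bitwise tests. This is only a sketch: the constant names and describeType() are hypothetical and are not taken from translate.js.

```javascript
// Flag values from the list above: import = 1, export = 2, web = 4.
// The constant names are invented for this sketch.
const TYPE_IMPORT = 1;
const TYPE_EXPORT = 2;
const TYPE_WEB = 4;

// Returns the capabilities encoded in a scraper's numeric type.
function describeType(type) {
  const capabilities = [];
  if (type & TYPE_IMPORT) capabilities.push("import");
  if (type & TYPE_EXPORT) capabilities.push("export");
  if (type & TYPE_WEB) capabilities.push("web");
  return capabilities;
}

console.log(describeType(TYPE_IMPORT + TYPE_WEB)); // [ 'import', 'web' ]
console.log(describeType(7)); // [ 'import', 'export', 'web' ]
```

Because each capability is a distinct power of two, every sum maps to exactly one combination.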

new features:
import now works
rudimentary RDF (unqualified dublin core only), RIS, and MARC import translators are implemented (although they are a little picky with respect to file extensions at the moment)
items appear as they are scraped
MARC import translator pulls out tags, although this seems to slow things down
no icon appears next to the URL when Scholar hasn't detected metadata, since this seemed somewhat confusing

apologies for the size of this diff. i figured if i was going to re-write the API, i might as well do it all at once and get everything working right.

comment:8 Changed 10 years ago by dstillman

  • Status changed from new to assigned

Simon, do we actually need/want a fromArray() function? I was originally thinking that the import mechanism could just accept (from the translators) the sort of multi-dimensional associative array that toArray() outputs, with the idea that that would make writing translators easier for lay folk, but I'm not sure how much easier it is to build up a proper array than to just use the existing API... It might be easier to immediately grok the array format by just looking at a sample dump, though.

What do you think?

comment:9 Changed 10 years ago by simon

there's an almost-complete version of fromArray() already in Scholar.Translate.prototype._itemDone() in translate.js (it doesn't yet handle creator types, notes, or see also, but that shouldn't be too difficult to implement). it's what i've been using for translators since i rewrote the API. i decided the array structure was worth it, because there's something to be said for using the same simple array system for all parts of Scholar.Translate.

the main problem with my implementation of this feature, it seems, is speed. i'm just using the standard data layer functions, but it seems like it takes longer than it ought to, especially when items have a lot of tags. any idea how to speed it up? it's certainly not a showstopper, but if you've got extra time on your hands i'd be happy if you could take a look. feel free to take over (or rewrite, if necessary) my code.
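
for reference, the array-driven approach can be sketched roughly as follows. this is not the code in _itemDone(): SketchItem and its setField/addCreator/addTag methods are hypothetical stand-ins for the data layer, and defaulting a missing creatorType to "author" is an assumption. notes and see also are skipped here as well.

```javascript
// Hypothetical in-memory item. setField/addCreator/addTag are illustrative
// stand-ins, not the actual Scholar data layer methods.
class SketchItem {
  constructor(itemType) {
    this.itemType = itemType;
    this.fields = {};
    this.creators = [];
    this.tags = [];
  }
  setField(name, value) {
    this.fields[name] = value;
  }
  addCreator(firstName, lastName, creatorType) {
    this.creators.push({ firstName, lastName, creatorType });
  }
  addTag(tag) {
    this.tags.push(tag);
  }
}

// Keys that are not plain fields and need their own handling.
const STRUCTURAL_KEYS = new Set([
  "itemID", "itemType", "creators", "notes", "tags", "seeAlso",
]);

// Builds an item from an array shaped like the toArray() output.
function itemFromArray(data) {
  const item = new SketchItem(data.itemType);
  for (const [key, value] of Object.entries(data)) {
    if (!STRUCTURAL_KEYS.has(key)) item.setField(key, value);
  }
  for (const c of data.creators || []) {
    // Assumption: fall back to "author" when no creatorType is given.
    item.addCreator(c.firstName, c.lastName, c.creatorType || "author");
  }
  for (const tag of data.tags || []) {
    item.addTag(tag);
  }
  return item;
}
```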

comment:10 Changed 10 years ago by simon

  • Resolution set to fixed
  • Status changed from assigned to closed

closing this ticket. there's no work that needs to be done here, since translate.js implements all of fromArray()'s proposed functionality and more.
