IRC log for #koha, 2006-02-19


All times shown according to UTC.

Time Nick Message
11:00 kados one, connection management like dbh
11:00 two, a single ZOOM::Package method in Koha that we can pass options to for handling all extended services in zebra
11:00 (because there are not very many of them)
11:02 paul not a bad idea, but if mike did not write it, maybe it means it's not that useful?
11:03 kados sub zebra_package {
11:03        my ($Zconn,$options,$operation,$biblionumber,$record) = @_;
11:04 $Zconn contains the connection object
11:04 $options contains a hash like:
11:04 action => "specialUpdate"
11:04 record => $xmlrecord,
11:04 etc.
11:05 $operation is what we put in the $p->send($operation)
11:05 (create, commit, createdb, etc.)
11:05 $biblionumber and $record are obvious :-)
11:05 paul: does that make sense?
11:05 paul yes, that does.
11:06 and I think it's a good suggestion.
11:06 kados I'm still gun shy with my programming :-)
11:06 OK ... great, I will add a new sub zebra_package
11:19 paul: in fact, could $biblionumber and $record be contained within $options?
11:20 paul I'm not strongly for putting everything in a hash in parameters.
11:20 it usually makes the API simpler at 1st glance, but it's less clear.
11:20 thus, I would put in a hash only what you can't be sure you'll have at every call.
11:20 and only this.
11:21 kados I'm not sure every call has a biblionumber and a record
11:21 for instance, a 'delete' would only have a biblionumber
11:22 createdb would have neither
11:23 paul thus i'm not sure having only 1 sub is a good solution.
11:24 maybe a package with some subs, each specialized, could be better.
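The design paul suggests here, specialized subs instead of one catch-all zebra_package(), can be sketched roughly as follows. This is an illustrative sketch in Python (Koha itself is Perl), and every name in it is hypothetical, not Koha's or ZOOM's actual API:

```python
# Hypothetical sketch of "a package with some subs, each specialized":
# each operation exposes only the parameters it actually needs, and a
# shared helper stands in for building a ZOOM package and calling send().

def _send_package(conn, operation, **options):
    """Placeholder for creating a ZOOM::Package on `conn`, setting each
    option, and calling $p->send($operation) in the real Perl code."""
    return {"operation": operation, "options": options}

def update_record(conn, biblionumber, record):
    # specialUpdate: add the record, replacing any existing version
    return _send_package(conn, "update",
                         action="specialUpdate",
                         recordIdNumber=biblionumber,
                         record=record)

def delete_record(conn, biblionumber):
    # a delete needs only the record identifier, not the record itself
    return _send_package(conn, "update",
                         action="recordDelete",
                         recordIdNumber=biblionumber)

def create_database(conn):
    # createdb needs neither a biblionumber nor a record
    return _send_package(conn, "create")
```

This addresses kados's later observation directly: a delete would pass only a biblionumber, and createdb would pass neither, so no caller has to supply arguments that do not apply.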
11:25 kados I see
11:25 hmmm
11:26 paul kados : nothing to answer to my mail to koha-devel subject : "testing tool ?" and "yahoo tool" ?
11:26 kados I looked at them
11:27 I think they could be very useful
11:27 we need to have a meeting soon about template design
11:27 to discuss katipo's plans for the new default templates
11:33 paul OK. will be here next week (monday 9PM for me)
12:05 thd paul: It is good that Yahoo is sharing so as not to be left behind by Google.
12:07 paul: are you still there?
12:07 paul yep
12:07 (4pm in france)
12:08 thd kados: I had been curious about whether any large vendors known in the US market library systems in France.
12:09 s/kados:/paul: kados/
12:10 paul ?
12:11 thd paul: Do Sirsi/Dynix , Innovative Interfaces etc. have systems marketed in France?
12:11 paul of course
12:11 with the same name as in US I think
12:13 thd paul: All of them? And do they all use Recommendation 995 for the software that they market there?
12:13 paul in fact, their market doesn't need reco 995. I'll explain:
12:13 in France, all major libraries (including ALL universities) MUST use SUDOC
12:13 (www.sudoc.abes.fr)
12:14 they use the sudoc tool (proprietary software)
12:14 to catalogue their biblios in the sudoc centralized DB.
12:14 sudoc unimarc doesn't use 995 for items.
12:14 it uses 906/907/908 iirc.
12:15 then, every night, sudoc FTPs the new biblios to the library server
12:15 for integration into local ILS.
12:15 NO cataloguing is done in local ILS
12:15 (except for books outside the sudoc, but there should be none)
12:15 thus, every vendor has a "moulinette" (a conversion script) to do SUDOC => local ILS
12:16 no need to support 995 or even sudoc unimarc.
12:16 .
12:16 thd paul: Do you have a link to the 906/907/908 standard that SUDOC uses?
12:18 kados paul: wow ... that is very efficient!
12:18 paul kados joking today ?
12:18 kados paul: France only catalogs items ONCE
12:19 paul except that the system has many critics :
12:19 kados ahh
12:19 paul * the libraries are paid when they create a biblio, but pay when they just localize an item in an existing biblio. Everybody pays in the end.
12:19 * the tool to catalogue with is quite outdated.
12:20 * you must wait at least 1 day to see an item in your catalogue
12:20 kados ahh
12:20 paul: I have a quick question
12:20 paul in fact, ppl want now a tool to upload their biblios from their local ILS to sudoc.
12:20 but sudoc refuses for now.
12:20 thd : no, no link
12:20 kados zebra allows updates to be performed with:
12:20 record, recordIdNumber
(sysno) and/or recordIdOpaque (user-supplied record Id). If both
12:21 IDs are omitted, internal record ID match is assumed
12:21 right now, we use internal record ID to match
12:21 do you anticipate us ever using recordIdOpaque or recordIdNumber for future kohas?
12:21 (maybe when we are using more than one record syntax?)
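The matching rule kados quotes from the Zebra documentation can be written as a small precedence function. A sketch, with a hypothetical function name (this is not Zebra's API, just the rule made explicit):

```python
# Sketch of the update-matching rule quoted above: match on
# recordIdNumber (sysno) and/or recordIdOpaque (user-supplied ID);
# if both are omitted, fall back to internal record-ID matching.

def choose_match_key(record_id_number=None, record_id_opaque=None):
    if record_id_number is not None:
        return ("recordIdNumber", record_id_number)
    if record_id_opaque is not None:
        return ("recordIdOpaque", record_id_opaque)
    return ("internal", None)
```

The precedence shown (sysno before opaque ID) is an assumption for illustration; the docs only say both may be supplied.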
12:24 thd kados: do I understand correctly that the system is not now using a preset explicit record ID?
12:24 kados hmmm
12:24 I think we are currently just using the 090$c field in MARC21, $a in UNIMARC
12:25 paul: correct me if I'm wrong
12:25 thd paul: should we not use 001 now finally?
12:25 kados thd: is that the standard ?
12:25 thd yes
12:25 kados thd: (for MARC identifiers)
12:25 thd yes
12:25 kados thd: for UNIMARC and USMARC?
12:25 thd all MARC
12:26 kados that sounds reasonable
12:26 paul kados & thd : since Koha 2.2.3 (iirc), you can put biblionumber in 001 without harm
12:26 thd kados: Furthermore it works very well with authorities
12:26 paul in fact that's what I did for IPT
12:27 what had to be done was removing biblionumber/biblioitemnumber in same MARC field constraint that existed previously
12:27 I did it, so you can put biblionumber in 001 if you want.
12:27 but if you import biblios from an external source, you may want to keep the original 001 and put biblionumber somewhere else
12:28 thd paul there is a place already for the original system number.
12:29 kados is that why zebra distinguishes between recordIdOpaque and recordIdNumber?
12:29 paul yes, in unimarc also
12:29 kados I'm not sure when these are different
12:32 thd paul: MARC 21 035 is a repeatable field for Original System Number
12:34 paul: It is identical in UNIMARC
12:34 hello dce
12:34 dce thd: I have an answer to your question.  Follett book pricing is pretty consistent with most distributors I'm told
12:35 thd dce: I would imagine that it is.  What does that mean in relation to the publisher's list price?
12:37 dce No clue.  If you really want to know I can ask though.
12:40 thd dce: I have wanted to know the answer to both the question you answered before and that one for a long time but had never asked the right person.
12:42 dce: I assume your answer was US$ for 0.26, was it not?
12:44 kados paul: The SUDOC approach is similar to how OCLC works except that OCLC is supposed to be a cooperative owned by the member libraries.
12:44 paul (on phone)
12:46 kados thd: right
12:47 thd paul: SUDOC is totally private, not a cooperative member organisation, in theory?
12:50 paul: OCLC behaves as though it were a greedy private company even though it is actually a non-profit or not for profit membership cooperative.
12:52 dce thd: nod price was US$.  I'll ask about pricing compared to list and let you know
12:54 thd dce: Thank you.  Whenever I ask industry people who know the market as a whole they have told me that they are not allowed to say.
12:58 paul: Are you still on phone?
12:59 paul (yes)
12:59 thd paul: Let me know when you are back?
13:15 paul back
13:16 thd : you're right : SUDOC is something strange, private in fact.
13:16 thd paul: I am interested in behaviour where a record can be added to the system, removed from the system for external manipulation if ever needed, and then added back to the system, all the while preserving the same number that the system uses to track the record.  The system would not then increment a new number if it was an old record.
13:16 paul this is impossible in Koha for instance.
13:17 thd paul: Impossible now but what about in 3.0
13:17 ?
13:18 paul nothing planned on this subject.
13:18 kados thd: so currently, every time a record is edited its id changes?
13:19 thd: or are you talking about something other than editing?
13:20 thd kados paul: Most library systems do this.
13:20 kados thd: i don't understand the functionality ...
13:21 thd: are you implying that when a record is edited or its status changed the record's ID increments?
13:21 thd: what is a practical use case for your desired behaviour?
13:23 thd kados: You can export a set of records and send them to OCLC or wherever for enhancement character set changes etc and then re-import them using the same record number.  It would be better if all systems could do this internally but they do not presently.
13:23 kados thd: it should be possible with ZOOM
13:23 you can specify both a user/client-defined ID and a system ID
13:23 as well as just pull the ID out of the MARC record
13:24 thd kados: Not if Koha does not support it.
13:24 kados thd: the subroutine I'm building to handle extended services will support this feature
13:24 as well as zebra's ILL :-)
13:25 I can't find a use case scenario for the client-defined ID or the system ID
13:25 thd: do you know when these would be used?
13:25 http://www.indexdata.dk/zebra/NEWS
13:25 search for 'recordIdOpaque' on that page
13:25 thd kados: Also, item identification numbers do not change when importing and exporting the same record on standard systems.
13:26 kados it's the only reference that explains those IDs that I can find
13:26 thd: right ... and they don't with Zebra either
13:26 thd: won't in Koha 3.0
13:27 thd: do you have any ideas for when we would use recordIdOpaque and when we would use recordIdNumber?
13:29 thd kados: user supplied IDs would be useful to preserve the 001 for its intended use between systems when migrating to Koha.
13:30 kados thd: right, I wonder if that's what they are intended for
13:30 thd kados: As well as the scenarios that I described above.
13:30 kados perhaps I should ask the Index Data guys
13:31 thd kados: Better to ask than to find out later there is another preferred purpose or means for what you need.
13:31 kados yep
13:34 thd paul: Do any of your customers catalogue with SUDOC now?
13:34 paul nope
13:35 thd paul: Would you have the potential to acquire those customers if you had a conversion for their 906/907/908?
13:35 paul me probably not. But ineo, yes, for sure.
13:36 thd paul: you are not ambitious? :)
13:36 paul and pierrick begins his work at Ineo in 2 weeks, and his 1st goal will be to write a "moulinette" to do things like this.
13:36 it's just that :
13:37 * I know the limit of my 2 men large company
13:37 * I don't want to become larger (even with pills ;-) )
13:37 * Ineo wants to work on this market, and also wants me as a leader with them.
13:37 so, no reason to be more ambitious than this ;-)
13:38 thd paul: You have an intern as well do you not?
13:38 paul an intern ???
13:46 kados now ... some breakfast :-)
13:49 paul now some week end ;-)
13:56 kados thd: I wrote a mail to koha-zebra
13:56 thd: asking some questions I had about ZOOM::Package
13:56 thd paul_away: sorry I had timed out without realising
13:57 paul_away: maybe I misread a #koha log.  An intern is usually a student or recent graduate who works under different circumstances than a regular employee, and usually for a limited time.
13:59 paul_away: I will find you next week.  Have a pleasant weekend.
14:01 paul still here in fact.
14:01 right thd, but the student will be here only for 2 months.
14:01 maybe 2+2 in the summer.
14:01 so I don't count him.
14:02 thd paul: Google's scheme for scanning the pages of library books has interns doing most of the labour.
14:03 paul: You could have an army of interns :)
14:04 paul: I have met French interns in the US who fulfil their national service requirement by working at a foreign branch of a French company.
14:04 paul yes, but that was before our president decided to go for a professional army => no national service (since 1997 iirc)
14:05 thd paul: In Anglophone countries interns are low paid or unpaid.  They are there to obtain the experience.
14:06 paul same in France.
14:06 (they CAN be paid up to 30% of SMIC -the minimum income for anyone in france-)
14:06 thd paul: Does SUDOC maintain a proprietary interest in their UNIMARC records?
14:06 paul if I'm not mistaken, yes.
14:07 ok, this time, i really must leave ;-)
14:07 kados bye paul
14:07 have a nice weekend
14:07 thd: http://search.cpan.org/~mirk/N[…]ZOOM%3A%3APackage
14:07 thd see you next week paul
14:09 kados thd: check out the options for updating there
14:10 thd: I think that will answer your questions earlier
14:10 thd kados: One of my questions for Mike was about "Extended services packages are not currently described in the ZOOM Abstract API at http://zoom.z3950.org/api/zoom-current.html They will be added in a forthcoming version, and will function much as those implemented in this module."
14:11 kados right
14:11 Mike is on the committee
14:11 currently, ZOOM is read-only
14:11 :-)
14:11 'official ZOOM that is'
14:12 but since Mike is on the committee ... and Seb is influential in Z3950 as well
14:12 I'm sure they will adopt it
14:12 thd kados: This option is nice: "xmlupdate: I have no idea what this does."
14:12 kados yea, one of my questions to koha-zebra :-)
14:13 this is what I thought you would like:
14:13 "the action option may be set to any of recordInsert (add a new record, failing if that record already exists), recordDelete (delete a record, failing if it is not in the database). recordReplace (replace a record, failing if an old version is not already present) or specialUpdate (add a record, replacing any existing version that may be present)."
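The four `action` values quoted from the ZOOM::Package documentation can be made concrete with a toy in-memory store. This is a model of the described semantics only, not Zebra or ZOOM itself:

```python
# Toy model of the four extended-services actions quoted above,
# showing exactly when each one succeeds or fails.

def apply_action(db, action, record_id, record=None):
    """db: a plain dict standing in for the Zebra database."""
    exists = record_id in db
    if action == "recordInsert":
        if exists:
            raise KeyError("record already exists")
        db[record_id] = record
    elif action == "recordDelete":
        if not exists:
            raise KeyError("record not in database")
        del db[record_id]
    elif action == "recordReplace":
        if not exists:
            raise KeyError("no old version present")
        db[record_id] = record
    elif action == "specialUpdate":
        db[record_id] = record  # add or replace; never fails
    else:
        raise ValueError("unknown action")
```

specialUpdate is the "upsert" of the group, which is why it is the default choice for routine reindexing.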
14:15 thd kados: That is exactly what I was hoping.  Now Koha needs to support that as well.
14:17 kados I'm looking at Biblio.pm right now :-)
14:17 thd kados: Koha also needs to manage items in such a way that item ID is persistent.
14:17 kados right
14:17 right now it's not?
14:17 what is a case scenario where item ID needs to be persistent?
14:18 (ie, why can't we just handle that with different statuses?)
14:18 thd s/persistent/persistent across imports and exports/
14:19 kados imports/exports ... hmmm
14:20 thd kados: you need a standard means of linking MARC holdings records with whatever the Koha DB is doing with holdings.
14:20 kados I see
14:20 will barcode work?
14:21 thd kados: Are you asking if barcodes would work as the persistent ID?
14:21 kados yes
14:23 thd kados: Maybe but an item needs an ID before a barcode has been assigned or even in the absence of a barcode for material that is not tracked with barcodes.
14:23 kados hmmm
14:23 itemnumber then
14:23 that will work right?
14:25 thd kados: some sort of itemnumber must work.  There needs to be a means to protect against its automatic reassignment when exporting and re-importing records.
14:26 kados: Also, it would seem advantageous to preserve item numbers between systems when migrating.
14:27 kados right
14:27 so I don't think zebra's equipped to handle ids at the item level
14:28 so what you're saying is:
14:28 we need to be able to export and import records and items
14:28 without affecting our management of them in Koha
14:28 ie, we don't want to delete a record every time we export it or import a new version
14:28 same with items
14:28 right?
14:28 thd kados: MARC uses a copy number concept but there can be items at the sub-copy level.  A serial title may have only one copy but many constituent items.
14:29 exactly
14:30 kados interesting
14:30 so we now have
14:30 record
14:30 copy
14:30 item
14:30 looks a lot like
14:30 biblio
14:30 biblioitem
14:30 item
14:30 :-)
14:32 I think zebra can handle everything we need to do at the record level
14:32 ie, that paragraph I posted above
14:32 so now all we need to do
14:33 is make sure copy-level and item-level import/export doesn't remove the item's id
14:33 can't we simply store the id somewhere in the record?
14:33 thd kados: yet if you export a record with fewer items and re-import with more items Koha should preserve the old item numbers and add item numbers for what needs them.
14:33 kados ie, can't we base our 9XX local use fields on that model?
14:33 with record-level data, copy-level data, and item-level data?
14:34 or do the frameworks in koha not support this?
14:34 so Koha just needs to pay attention to certain item-level and copy-level fields before assigning item numbers
14:34 that should be literally a 3 line change!
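The re-import rule thd is asking for, keep any itemnumber the incoming item already carries and mint new numbers only for items that lack one, could look roughly like this. A hypothetical sketch (Python for illustration; Koha's real item handling is in Perl, and the field names here are assumptions):

```python
# Hypothetical sketch: on re-import, preserve existing item numbers and
# assign fresh ones only to items that arrive without one.

def assign_itemnumbers(items, next_number):
    """items: list of dicts that may or may not carry an 'itemnumber'."""
    for item in items:
        if item.get("itemnumber") is None:
            item["itemnumber"] = next_number
            next_number += 1
    return items, next_number
```

With this rule, exporting a record with two items and re-importing it with three leaves the original two numbers untouched, which is the persistence thd describes.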
14:35 thd kados: you can do anything with MARC and the frameworks allow you to create whatever you need except that they need better support for indicators, fixed fields, etc.
14:36 kados: Remember the problem about breaking the MARC size limit when there are too many items in one record.
14:38 kados: Libraries that need to track every issue of a serial forever need multiple interlinked holdings records in MARC.
14:39 kados how was that accomplished?
14:39 thd kados: Koha could integrate everything in your grand non-MARC design for a holdings record but then you need to be able to export into MARC.
14:40 kados which wouldn't be hard once we had the functionality built in
14:40 it's the first step that's hard :-)
14:41 I don't understand copy-level vs record-level
14:41 thd kados: There are various linking points in MARC holdings records and various means for identifying the part/whole relationships.
14:41 kados thd: should I read the cataloger's reference shelf?
14:41 thd kados: Do you mean copy level vs. item level?
14:42 kados well ... I'm not really sure what the difference between the three are in terms of MARC holdings
14:42 I understand koha's hierarchy, just not the MARC holdings one
14:44 thd kados: Records have an arbitrary level that is distinguished by information referring to other records, which may be textual information that is difficult for a machine to parse, or may be more machine readable and difficult for humans to parse and program.
14:45 kados: Machine readable can be difficult to program because human programmers have to parse it correctly first :)
14:46 kados right :-)
14:46 I'm reading the introduction to holdings now
14:46 OK ... right off the bat I can tell we're going to need leader management
14:47 thd kados: copies can be at whatever level the cataloguer wants to distinguish a copy.
14:47 kados the leader determines the type of record we're dealing with
14:47 whether single-part
14:47 multi-part
14:47 or serial
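The single-part / multi-part / serial distinction kados mentions lives in Leader position 06 of a MARC 21 holdings record. A small sketch of reading it (the code values are from the published MARC 21 Holdings format; the function itself is illustrative):

```python
# Sketch of reading Leader/06 (Type of record) from a MARC 21 holdings
# record leader.  Codes per the MARC 21 Format for Holdings Data:
#   x = single-part item holdings, v = multipart item holdings,
#   y = serial item holdings, u = unknown.

HOLDINGS_TYPES = {
    "x": "single-part item holdings",
    "v": "multipart item holdings",
    "y": "serial item holdings",
    "u": "unknown",
}

def holdings_type(leader):
    """leader: the 24-character leader string of a holdings record."""
    return HOLDINGS_TYPES.get(leader[6], "invalid")
```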
14:48 yikes, they're putting addresses in the 852!
14:48 that's really bad practice
14:49 so every time a library changes its address you need to update all the records?
14:49 or if an item is transferred to another address you need to update the address for that item?
14:49 that's absurd
14:49 thd kados: yes I lapsed a day or so ago I think and wrote 006 for 000/06 000/07.
14:50 kados http://www.itsmarc.com/crs/hold0657.htm
14:50 thd kados: remember this was designed in the mid sixties.
14:50 kados so 004 is used for linking records
14:51 thd kados: Do not expect libraries to start with MARC holdings records.  Mostly they will have bibliographic records with holdings fields included.
14:52 kados: The largest libraries with the best up to date library systems will have MARC holdings records.
14:53 kados An organization receiving a separate holdings record may move the control number for the related bibliographic record of the distributing system from field 004 to one of the following 0XX control number fields and place its own related bibliographic record control number in field 004:
14:53 what is the normal form of that control number?
14:53 ie, say we have a parent record
14:54 and three children
14:54 the parent record doesn't have a 004
14:54 and we need to 'order' the three children, right?
14:54 ie, first, second, third
14:54 does their 004 field contain whatever is in the parent 001?
14:55 thd kados: 001 from the related record is used.
14:56 kados so there is no way to order them
14:56 ok ... next question
14:56 so we get a record from oclc
14:56 we grab their 001 and put it in 014
14:56 then, say we want to check to see if that record is up to date
14:56 thd kados:  why is there no way to order them?
14:57 kados we query oclc for the number in 014 (in their 001) to find if there is an updated record?
14:57 is that the reason we preserve the incoming 001?
14:58 thd kados: I should immediately send you my draft email from months ago.
14:59 kados: It is all instantly relevant.
14:59 kados please do
15:02 thd kados: This weekend certainly, but to answer you want to preserve numbers from a foreign system, usually in 035 for bibliographic records to be able to refer back to the original record for updating if the original foreign record is changed or enhanced in some manner.
15:03 kados: If you have the original number, then perhaps the system can automatically check for an update on the foreign system.
15:05 kados: Existing systems do not actually do this to my knowledge but a human can query the original system with the unique record number from that system.
15:06 kados: Cataloguers have no time for this now but that is the purpose and time would not be a factor in an automated system.
15:07 kados thd: on the holdings page it says to preserve it in 014
15:07 thd or 035.  There are options.
15:09 kados: sorry, you are right for the linkage number.  001 goes in 035.
15:11 kados: Although, I suspect 014 may be more recently specified than 035.
15:12 kados: Both are repeatable and could be filled with numbers from multiple systems.
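The renumbering just discussed, preserve the foreign system's control number in a repeatable field such as 035 (or 014 for holdings linkage), then install your own number in 001, can be sketched with a toy record structure. Everything here (the dict-of-tags representation, the function name) is hypothetical, not a real MARC library:

```python
# Illustrative sketch of localizing an imported record: keep the foreign
# 001 in a repeatable field (035 by default), then install the local ID.

def localize_record(record, local_id, keep_in="035"):
    """record: dict mapping tag -> value; `keep_in` holds a list."""
    foreign = record.get("001")
    if foreign is not None:
        record.setdefault(keep_in, []).append(foreign)
    record["001"] = local_id
    return record
```

Keeping the foreign number is what later lets a system (or a human) query the original source for an updated or enhanced version of the record, as thd describes.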
15:15 kados right
15:16 thd kados: The utility of preserving holdings numbers from foreign systems would be primarily for libraries that catalogued on OCLC or some other remote system.  Or even added their own records to OCLC which SUDOC disallows.
15:21 kados: to continue about records : copies : items
15:22 kados: a record may have one or many copies associated.
15:23 kados: copies may be distinguished on any level at which the record is specified.
15:24 kados what does that last point mean?
15:24 thd kados: Within a copy level there may be subsidiary items.
15:25 kados: Provided the items are in the same record as the copy number.
15:26 kados I usually need examples rather than abstract descriptions :-)
15:27 thd kados: Copies are distinguished by number.  Yet a copy may be for a serial title and individual items may be for issues or bound years etc.
15:28 kados: Imagine a serial title where there are 3 copies.
15:29 kados meaning there have been three issues?
15:29 Vol. 1 No. 1, No. 2, No. 3 ?
15:29 dce thd: Here is the reply I got about pricing: "I think it depends.  Some vendors are cheaper for certain things.  Sara says she likes to order from Baker & Taylor because they have a better price.  Most of the school librarians I know do business with Follett--most of the Public Librarians I know do business with Baker & Taylor."
15:29 thd kados: 3 copies of the whole title for several years.
15:29 kados if a copy isn't an issue, what is it?
15:33 thd dce: Does Sara have info on the degree of discount Baker and Taylor gives to libraries and how much for Follett in relation to list price?  What things are liable to have better prices than others and what is the degree of variance?  This is something I have wondered about for years and I have worked for over 15 years in the retail book trade so I could tell you all about those discounts.
15:34 kados thd: in order to replicate MARC holdings support in Koha we need two things:
15:34 1. a list of elements of MARC holdings
15:34 thd kados: a copy can be at any level including the entire run of a serial for many years as a single copy.
15:34 kados 2. a list of behaviours of those elements
15:35 and these lists should be short laundry lists
15:35 bare minimum needed to explain what they are and what they do
15:35 the third thing we will need
15:35 is a mapping between the list and MARC fields
15:35 thd: follow me?
15:35 thd yes
15:36 kados thd: so, I'd be happy to chat with you for the next few hours if we could compile such lists
15:36 I fear we're going to get lost in the forest without any clear objective :-)
15:36 because the standard is quite large :-)
15:37 thd kados: that list would be as long as the specification when it comes to the tricky parts.
15:37 kados well ... can we abstract out a subset that will cover 99% of cases?
15:38 thd: how do you recommend specing out full support for MARC holdings in Koha?
15:38 thd kados: you need to have a bidirectional mapping for standards compliance
15:38 kados thd: of course
15:39 and fortunately for us, zebra includes support for .map files so that shouldn't be too hard to do
15:39 thd kados: and encompassing a one to one mapping to the extent that nothing should be lost in conversion.
15:39 kados we just need a kohamarc2standardmarc.map
15:39 right
15:39 it's definitely possible
15:40 we just need to invest the time into writing up the lists and how they interact
15:40 and write that into our framework
15:40 as well as our import/export routines
15:40 thd kados: certainly possible and certainly you can devise a better flexible structure for managing the data than MARC.
15:40 kados yep
15:42 thd kados: but if there was a high degree of translation need between MARC and Koha in real time the system would become bogged down under a heavy load with a large number of records.
15:43 kados: The presumption must be that real time translation will be only for the few records currently being worked on.
15:45 kados thd: exactly
15:45 thd kados: Translation is CPU intensive.  Your better record system needs a performance gain exceeding the overhead of translation.
15:47 kados: MARC was designed in the days when almost everything ran as a batch process.  People had very modest real time expectations.
15:49 kados: shall I continue with the record : copy : Item distinction?  If you miss that you miss the most fundamental element.
15:50 kados yes please do
15:50 but lets start making those lists
15:50 as soon as you're done
15:50 so we can then put things into practical terms
15:52 thd ok
15:53 kados: a single copy can be for a whole run of many years of a serial title containing many issues within just one copy.
15:56 kados ok ... strange choice of terms but I think I get it
15:56 thd kados: A copy can be assigned at an arbitrary point.
15:57 kados I see
15:57 thd kados: So imagine 3 copies in a single record for the whole run of a serial
15:58 kados would there be 3 because the MARC would run out of room for one?
15:58 thd kados: copy 1 is a  printed copy of loose issues in boxes
15:58 kados or for completely arbitrary reasons?
15:59 ok
15:59 thd kados: We will presume everything fits for my example, but that is a practical problem that needs to be addressed for creating MARC records when the threshold is reached
16:00 kados ok
16:01 thd kados: copy 2 has each year bound separately.
16:03 kados: copy 3 is a variety of full text databases providing coverage, but they might have been linked to the hard copy.
16:04 kados: copy 1 designates individual issues as items but without a separate copy number.
16:05 kados: copy 2 designates individual years as items but again without a separate copy number.
16:06 kados: copy 3 is a jumbled mess because that is the world of agreements that often cover electronic database rights :)
16:07 kados: Individual items may or may not have separate barcodes.
16:07 kados:  Just for fun we may imagine that all items do.
16:07 kados hmmm
16:07 copies 1-3 cover the same years?
16:08 thd kados: wait the fun has only just begun
16:09 kados: the electronic database coverage is unlikely to be identical for full text unless the hard copies are relatively recent and even then gaps should be expected.
16:09 kados question:
16:09 copy 1 designates individual issues as items but without a
16:09             separate copy number
16:09 thd kados: If you are lucky your vendor will tell you about all the gaps.
16:09 kados what does that mean?
16:11 what do you mean that it doesn't have a separate copy number?
16:11 why would it?
16:11 thd kados: In our record example MARC using $6 distinguishes copy numbers but can distinguish items at a lower level using $8 if I remember but we can check later.
16:12 kados I literally don't understand what it would mean to have a _separate_ copy number
16:12 it is in copy 1 right?
16:12 so 1 is the copy number
16:13 thd kados: yes all in copy 1
16:13 that is the copy number
16:13 kados so this sounds quite a lot to me like biblio,biblioitems,and items
16:14 at least where holdings is concerned
16:14 thd kados: except that it is arbitrary
16:14 kados or do we sometimes need more than three levels?
16:14 in our tree?
16:14 thd kados: Serials can be very complex and may use many levels
16:15 kados ok
16:15 so we need a nested set then
16:15 to handle holdings
16:15 thd kados: yes
16:16 kados do we need to do more than just map child-parent relationships?
16:16 thd kados: MARC has theoretical limits within one record.  Koha can do better for other types of data.
16:16 kados right
16:16 thd kados: siblings
16:16 kados relationships between siblings? more complex than ordering?
16:18 so we have:
16:18 one record-level ID which corresponds to MARC 001
16:18 an arbitrary number of copy-level IDs which correspond to which MARC field?
16:19 and an arbitrary number of item-level IDs which map to which MARC field?
16:19 welcome joshn_454
16:20 thd: then, we have relationships ... child-parent in particular
16:20 thd kados: a publication with a bound volume and a CD-ROM could be all described in one record or could have separate sibling records linked together.
16:20 kados thd: would they be linked to each other?
16:20 ie, each one of them would refer to the other?
16:20 or would they both refer to the parent?
16:20 thd kados: I have seen reference to sibling linking.
16:21 kados thd: sibling linking could become quite complex
16:21 thd: because then we have exceeded the abilities of a nested set
16:22 thd kados: Plan on finding cataloguing records in existing systems that require system interpretation to uncover the true relationships that you would want to map in Koha.
16:22 kados thd: I can track an arbitrary hierarchy with a nested set and even do sibling ordering
16:22 so for example
16:22 grandfather
16:22 father uncle
16:22 child1 child2 child3
16:22 where child1 is older than child2, etc.
16:23 and it would be easy to query to find who your siblings are
16:23 but I'm assuming we don't want to do that every time we pull up a record :-)
16:23 (and father is older than uncle in the above case)
16:24 thd: your thoughts?
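The grandfather/father/uncle tree kados describes maps directly onto the standard nested-set technique: number each node with lft/rgt by a depth-first walk, and "descendants of X" becomes a range query, with sibling order falling out of lft. A self-contained sketch of exactly that example:

```python
# Standard nested-set numbering applied to kados's example tree.
# Each node gets (lft, rgt); a node Y is a descendant of X iff
# X.lft < Y.lft and Y.rgt < X.rgt.  Sibling order follows lft.

def number_tree(name, children, counter=None, out=None):
    if counter is None:
        counter, out = [1], {}
    lft = counter[0]; counter[0] += 1
    for child, grandkids in children:
        number_tree(child, grandkids, counter, out)
    out[name] = (lft, counter[0]); counter[0] += 1
    return out

# child1 is older than child2, etc.; father is older than uncle
tree = [("father", [("child1", []), ("child2", []), ("child3", [])]),
        ("uncle", [])]
nums = number_tree("grandfather", tree)

def descendants(nums, name):
    lft, rgt = nums[name]
    return sorted(n for n, (l, r) in nums.items() if lft < l and r < rgt)
```

Finding siblings is then just "children of the same parent ordered by lft", which supports kados's point that sibling relationships can be derived rather than stored.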
16:24 thd kados: They can always be mapped but you want to put them in a scheme that does a better job of making the relationships explicit than the cataloguer may have done.
16:25 kados I think that approach could be dangerous
16:25 ie if the relationships aren't pre-defined
16:25 I don't want to interpret what they should be
16:25 that's the realm of FRBR
16:25 thd kados: existing cataloguing records in the real world are often a mess of inconsistent practises.
16:26 kados thd: it's not the job of the ILS to fix cataloging inconsistencies
16:26 thd: my goal is to allow complex cataloging
16:26 thd: not to intuit it :-)
16:27 thd kados: Perl is for making easy jobs easy and hard jobs possible
16:27 kados thd: so let's stick to the here and now
16:27 thd: that's outside the scope of 3.0
16:27 thd: I have to draw the line somewhere :-_
16:27 thd: :-)
16:27 thd back to my example where the fun is just starting
16:27 kados thd: so far, it's just a simple hierarchy
16:28 thd: nothing particularly complex about it
16:28 akn hi guys, I'm back for more help; we had the z3950 daemon running about 2 months ago, now we can't seem to get it.  The processz3950queue doesn't give any output...seems to be waiting like tail -f (?)
16:28 kados thd: you have a table with a key on recordID
16:28 akn: there is a config file in that directory, did you edit it?
16:28 thd kados: this is largely a question of nomenclature so the documentation does not confuse the terms.
16:29 kados akn: and you should not be starting processqueue from the command line
16:29 akn kados: no
16:29 kados: how do you start it?
16:29 kados akn: use z3950-daemon-launch.sh
16:30 akn kados: we did, with no visible results
16:30 thd kados: so we started with one record describing 3 copies
16:30 kados akn: if you're running it on the command line
16:30 akn: use z3950-daemon-shell.sh
16:31 akn: if from system startup use z3950-daemon-launch.sh
16:31 akn: and you need to edit the config file and add your local settings
16:32 thd: right, and it seems like a simple hierarchy to me
16:32 thd: with child-parent relationships
16:32 thd: nothing else
16:32 thd: sibling relationships can easily be derived
16:32 thd akn: also it needs to be started as root with the Koha environment variables set and exported.
16:34 kados: yes for nomenclature distinction though consider where we started with one record and then add subsidiary records.
16:34 kados thd: so here's our table:
16:34 holdings_hierarchy
16:34 columns:
16:34 itemId
16:35 (a unique ID for every element within the hierarchy)
16:35 recordId
16:35 (the 001 field in MARC, or another identifier in another record type)
16:35 lft
16:35 rgt
16:35 thd kados: copy2 in the parent record is still copy2 in that record but each volume covering one year can have its own copy number in subsidiary records.
16:35 kados actually, that might get hard to manage with such a large set
16:37 thd: that's still just a simple tree
16:37 thd: the MARC holdings folks have just managed to hide that fact
16:37 thd: by making it needlessly wordy :-)
16:38 thd kados: the level at which copy numbering operates depends upon whatever bibliographic level it is contained within or linked to in the case of a separate holdings record.
16:38 kados: one further problem for my example
16:38 kados thd: it is still and yet nothing more than a simple tree :-)
16:39 akn thd: variables are set/exported; we did have it running before
16:39 thd kados: someone has the clever idea of rebinding copy1 with loose issues in boxes.
16:40 kados thd: no problem, you just restructure the parent-relationship for those records
16:40 thd: it's not nearly as complex as it sounds
16:41 joshn_454 thd: what environment vars need to be exported?
16:41 for the z3950 daemon
16:42 kados joshn_454: KOHA_CONF
16:42 joshn_454: PERL5LIB
16:42 thd kados: The structure itself is merely branches in a hierarchy the problem is reading in and out of MARC for correct interpretation, particularly when either there is only human readable text or the machine readable elements are so difficult for the programmer to interpret.
16:43 joshn_454 kados: how does KOHA_CONF differ from the KohaConf setting in the options file?
16:43 kados thd:  the first thing we need to do is create a framework for the hierarchy that works for managing holdings data
16:43 thd: ie a table :-)
16:44 joshn_454: KOHA_CONF is wherever your koha.conf file lives
16:44 joshn_454 okay
16:44 kados joshn_454: PERL5LIB should include the path to your C4 directory
16:45 thd joshn_454: maybe /etc/koha.conf for KohaConf
16:46 joshn_454 thd: right, that's what it's set to in the options file
16:47 thd joshn_454: Did you have it working previously?
16:47 kados joshn_454: notice I didn't mention KOHA_CONF :-)
16:47 thd: here's a table design that should work:
16:47 CREATE TABLE holdings_hierarchy (
16:47        itemID, --unique
16:47        recordID, -- this is the 001 in MARC
16:47        parentID, -- the parent node for this node
16:47        level,  -- level in the tree that this element is at
16:48        status, -- we need to decide how to handle this element in Koha
16:48 status is a bit limited
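The adjacency-list table pasted above is truncated in the chat; a runnable version might look like the sketch below. The column types and sample rows are assumptions added for illustration, and the recursive walk shows the cost kados alludes to: reading a whole subtree out of a pure adjacency list takes a recursive query.

```python
import sqlite3

# Completed sketch of the holdings_hierarchy adjacency list from the
# chat (types and sample data are invented; the chat gave neither).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE holdings_hierarchy (
        itemID   INTEGER PRIMARY KEY,  -- unique id for every node
        recordID TEXT,                 -- the 001 in MARC
        parentID INTEGER,              -- parent node, NULL at the root
        level    INTEGER,              -- depth in the tree
        status   INTEGER               -- how Koha handles this node
    )""")
conn.executemany("INSERT INTO holdings_hierarchy VALUES (?,?,?,?,?)", [
    (1, "rec1", None, 0, 1),   # bibliographic root
    (2, "rec1", 1,    1, 1),   # copy1
    (3, "rec1", 1,    1, 1),   # copy2
    (4, "rec1", 2,    2, 1),   # a volume bound under copy1
])

# Fetching the whole subtree under node 1 requires recursion,
# which is where nested sets (lft/rgt) would win on reads.
descendants = [r[0] for r in conn.execute("""
    WITH RECURSIVE sub(itemID) AS (
        SELECT itemID FROM holdings_hierarchy WHERE itemID = 1
        UNION ALL
        SELECT h.itemID FROM holdings_hierarchy h
        JOIN sub ON h.parentID = sub.itemID)
    SELECT itemID FROM sub ORDER BY itemID""")]
print(descendants)  # [1, 2, 3, 4]
```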
16:48 joshn_454 kados: yes, it was working before
16:49 kados joshn_454: did you upgrade or something? what has changed?
16:49 thd joshn_454: had you changed any config files?
16:49 joshn_454 I'm not aware that anything's changed :-/
16:51 thd joshn_454: Test that root has the koha environment variables set.
16:51 kados joshn_454: you sure it's not already running?
16:52 thd joshn_454: echo $KOHA_CONF
16:52 kados ps aux |grep z3950
16:52 thd joshn_454: echo $PERL5LIB
16:53 kados thd: what do you think about that hierarchy chart
16:53 thd joshn_454: both commands as root
16:53 kados thd: the status would be used to determine how Koha manages that node
16:53 joshn_454 I copied the C4 directory into perl's vendor directory; do I still need the $PERL5LIB?
16:54 kados joshn_454: no
16:54 joshn_454 k
16:54 kados joshn_454: but why you would do that is beyond me
16:54 joshn_454 I doubt $KOHA_CONF is set
16:54 bc I didn't know what I was doing
16:54 kados joshn_454: and will lead to confusing issues when you upgrade
16:54 joshn_454: delete it
16:54 joshn_454: then just do:
16:54 joshn_454 alright, I'll nuke it
16:54 kados export KOHA_CONF=/path/to/koha.conf
16:55 export PERL5LIB=$PERL5LIB:/path/to/real/C4
16:55 where the /path/to/real/C4 is the parent directory for C4
16:55 thd kados: The table needs more elements for tacking to MARC
16:55 kados thd: no, that's done with a mapping file
16:56 thd ok
16:56 kados thd: keep MARC out of my Koha :-)
16:57 now ... here's the real trick
16:57 thd : - )
16:57 kados what if we could come up with an encoded scheme for representing the hierarchy
16:58 and put it somewhere in the MARC record
16:58 I need to do some more thinking about this
16:59 thd kados: Do you mean an encoding generated by a script?  Not something a cataloguer would be meant to edit?
16:59 joshn_454 okay, did that.  Now z3950-daemon-shell runs, but the search still doesn't work
16:59 kados thd: exactly
17:00 thd joshn_454: was search working previously?
17:00 kados joshn_454: what sources are you searching, are they set up correctly, and are you searching on something that is contained in them?
17:00 thd joshn_454: search for new titles each time, but ones known to be contained in the target database
17:04 joshn_454 didn't know about trying a new search every time
17:06 ah.  That works!  Tx!
17:10 thd joshn_454: if you search for the same ISBN Koha may have code that assumes you found that already.
17:11 joshn_454: Koha 3 will take the pain out of Z39.50 searching.
17:12 joshn_454 eta for koha 3?
17:12 thd Line 103 should not be as follows:
17:12 refresh => ($numberpending eq 0 ? 0
17:12        : "search.pl?bibid=$bibid&random=$random")
17:12 If you find that, with a '0' after the '?', change it to the following:
17:12 refresh => ($numberpending eq 0 ? ""
17:12        : "search.pl?bibid=$bibid&random=$random")
17:12 Now you should have empty quotes ('""') instead of '0'.
17:15 joshn_454: See above for fix for line 103 for
17:15 /usr/local/koha/intranet/cgi-bin/z3950/search.pl or wherever you put
17:15 z3950/search.pl
17:16 joshn_454: maybe May, but it will not be production stable until a few months later.
17:17 joshn_454: I expect to commit at least a Z39.50 client external to Koha for 2.X prior to that time.
17:23 kados: are you still with us?
17:31 kados: your holdings table needs both a holdings and a bibliographic record key in case they are not identical as with a separate MARC holdings record.
17:33 kados: you need holdings_recordID and bibliographic_recordID.
17:35 kados: If that is not too much MARC in your Koha ( :-D
17:40 kados: without more MARC in your Koha, where do the semantic identifiers go to signify what the item represents as distinct from who its parent is?
18:01 kados thd: I'm here now
18:02 thd kados: MARC has 3 types of representations for holdings
18:02 kados yep, and all three can easily be done in our forest model
18:02 thd kados: the first and oldest being textual data
18:03 kados: The other two being allied with machine readability, but not necessarily
18:05 kados: There are captions such as 1999, June, July, etc. and 1999, issue no. 1, no. 2, etc.
18:06 kados at this point, I'm not really worried at all about import/export issues
18:06 that's something to do when we have a client that needs this complexity
18:06 thd kados: Then there are publication patterns that show the relationships without having common names like captions.
18:06 kados what I want to focus on is the framework for supporting the complexity
18:06 so that I can sell the system to one of the libraries that can afford me to build the import script :-)
18:07 thd: does that make sense to you? :-)
18:08 thd kados: you asked about whether you can look in one place in MARC consistently for the hierarchy did you not?
18:08 kados yes
18:08 i do have another question too
18:09 internally, what holdings data do we need indexes on?
18:09 I can think of two:
18:09 status
18:09 scope
18:09 status could be a simple binary 1/0 to determine visibility
18:10 scope refers to the physical location of the items
18:11 thd kados: If holdings are all contained within one bibliographic record and not separate records then you have no nice parent child 001 and 004 values to consult.
18:11 kados it's an integer that's relevant to a given library system's branch hierarchy
18:11 thd: it's not going to be possible to relate holdings to one another in zebra
18:11 thd: unless there's something in zebra I don't know about
18:12 thd: we'll have to do queries to relate the data
18:12 thd: (ie, give me all the records where the 004 is X)
18:12 thd kados: paul has a more complex status value than just 0/1 for UNIMARC Koha although the extra values are currently only for human reading.
18:12 kados I see what you mean though
18:13 it would be nice to do some kind of subselect in zebra
18:13 thd kados: What do you mean by subselect exactly?
18:14 kados so you go: find me all the records with the phrase 'harry potter', and check all the related holdings records and tell me which ones are available
18:14 I assume that's what you mean right?
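The "subselect" kados describes could be sketched like this: the search engine returns matching bibliographic record IDs, and a single grouped query against the holdings table reports availability per record. Everything here (table, columns, sample data) is invented for illustration; in Koha the hit list would come from Zebra, not a hard-coded list.

```python
import sqlite3

# Hypothetical availability check: join search hits against holdings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings (itemID INTEGER, recordID TEXT, status INTEGER)")
conn.executemany("INSERT INTO holdings VALUES (?,?,?)", [
    (1, "bib1", 1),   # one available copy of bib1
    (2, "bib1", 0),   # one unavailable copy of bib1
    (3, "bib2", 0),   # bib2's only copy is out
])

# Pretend these record IDs came back from the 'harry potter' search.
hits = ["bib1", "bib2"]
available = {rec: n for rec, n in conn.execute(
    "SELECT recordID, SUM(status) FROM holdings "
    "WHERE recordID IN (?, ?) GROUP BY recordID", hits)}
print(available)  # {'bib1': 1, 'bib2': 0}
```

This is the per-search join every ILS has to do somehow, as thd says; the design question is only where the holdings live so the join stays cheap.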
18:15 thd kados: That is what every system has to do some how.
18:15 kados but I think I see one of the problems
18:15 suppose we import a bunch of records from one of these big libraries
18:15 say 20% of the MARC records are JUST holdings records
18:15 we have a problem :-)
18:16 because even if we do pull them into our shiny new forest
18:16 they will still be in zebra
18:16 ie, they are still just MARC records
18:17 so we'll have holdings data in Koha in our forest
18:17 and we'll have holdings records in the MARC data in zebra
18:18 unless we delete all the holdings data in the MARC on import
18:18 thd kados: You have to pull the data out of the dark MARC forest and put it into  your superior shiny Koha representation forest.
18:18 kados in which case we'll have a bunch of strange-looking MARC records stranded in our zebra index :-)
18:18 devoid of any substance :-)
18:18 thd: right!
18:19 thd: and also be able to put it back into MARC on export
18:19 thd kados: Those records will be fine there Koha, needs to know how to update them as necessary.
18:20 and create new ones as needed so that MARC is available for sharing with the world that does not know how to read the shiny forest format.
18:22 kados is it fair to say that we only need to update the status of the root MARC record (bibliographic) if there are no copies or related copies available?
18:22 thd kados: When the shiny forest format becomes the lingua-franca for holdings data then you will have less need to share in MARC.
18:23 kados thd: I can't think of a reason we'd need to store the status of each and every item in the MARC in zebra
18:23 thd: I think we could just set a visibility flag
18:24 thd: that would be turned off when the system detects that all the items of are lost or deleted
18:24 thd: does that make sense?
18:24 thd kados: Why do you need any status in MARC? Had we not settled that months ago?
18:25 kados there is a reason
18:26 thd kados: Or do you need one boolean to inform Zebra not to return a record for a search.
18:26 kados some librarians don't want MARC records to show up if their status is lost or deleted
18:26 ie, people don't want to find a record that has no items attached to it
18:26 (well, that obviously doesn't apply to electronic resources)
18:27 you just need to tack on an extra 'and visible=1' to your CQL query
18:27 thd kados: There are also secret collections and material at many libraries.
18:27 kados to only retrieve records for items that the library wants patrons to see
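The "tack on an extra 'and visible=1'" idea above can be shown as a tiny query-building helper. This is a sketch only: the function name and the `visible` index name are assumptions, since Zebra index names are configurable.

```python
# Hedged sketch: the OPAC wraps whatever the patron typed and appends
# a visibility clause, so suppressed records never match a public search.
def patron_query(user_terms, visible_index="visible"):
    # 'visible' as an index name is an assumption for illustration
    return f"({user_terms}) and {visible_index}=1"

print(patron_query("title=harry potter"))
# (title=harry potter) and visible=1
```

Staff searches would simply skip the wrapper, which is also where per-collection access levels (thd's point below about restricted collections) could hook in.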
18:29 thd: are their any standards for 'status'?
18:29 thd: that you know of?
18:30 thd kados: do you not need levels of privilege over what is searched and what is not? Not for the simplest cases, but for the special collections that require privileged access to search.
18:30 kados hmmm
18:30 good point
18:30 some kind of access control would be nice
18:30 thd kados: There are MARC fields for restrictions and visibility already.
18:31 kados how are they structured? anything useful?
18:31 thd kados: I expect actually knowing what is in the collection or not is a big issue at many corporate libraries.
18:32 kados: Although the public library in the city where I grew up did not want just anyone even to know about the older material they had because of theft problems.
18:33 kados that is a good point
18:34 thd kados: 506 - RESTRICTIONS ON ACCESS NOTE (R) is one.
18:38 kados: Somewhat related are MARC fields designed to be hidden from patrons showing who donated the money or whatever.
18:39 kados thd: 506 is anything but machine readable
18:39 thd kados: Fields that may or may not be public depending upon an indicator.
18:39 kados thd: I can't believe they even bother calling it machine readable cataloging
18:40 thd kados: you can fill it with only authorised values and its repeatable.
18:41 kados here's the example they give:
18:41 506 Closed for 30 years; ‡d Federal government employees with a need to know.
18:41 that looks like free-text to me :-)
18:41 thd kados: you can make it machine readable or at least additional repeatable fields to what may already exist.
18:41 kados we could do the authorized values thing, but this is getting absurd
18:42 MARC really must die
18:42 I'll be back in an hour or so
18:43 thd kados: Where MARC has not defined values you can define your own in conformity with the standard.
18:43 kados: See you in an hour or so.
20:41 now we are both back kados
20:41 kados yep
20:41 I just read through OCLC's descriptions for the 506
20:42 thd kados: does OCLC do something special with them?
20:42 kados http://www.oclc.org/bibformats/en/5xx/506.shtm
20:42 thd s/them/it/
20:43 kados is OCLC's goal to become the world's largest ILS?
20:43 thd kados: OCLC has sometimes added their own extensions to MARC using unassigned subfields
20:43 kados: They are already ;)
20:43 kados hehe
20:44 I think we've got a solid idea in mind for how to represent the relationships
20:44 thd kados: at least they have fewer extensions to MARC that are outside the standard now
20:45 kados a table with the structure of an arbitrary forest using a combo of nested sets and adjacency lists will do the trick nicely
20:45 now ... I'm interested in figuring out how to represent the "entities"
20:45 and by 'entities' I mean all the components (levels) of a MARC record
20:45 thd kados: what is an adjacency list
20:45 ?
20:46 kados thd: what I showed earlier
20:46 that table was a simple adjacency list
20:46 it's a table that has:
20:46 nodeId
20:46 parentId
20:46 thd for the nearest adjacent node
20:47 kados anyway ... we need to come up with a way to represent all the information we'll need to know about one of the entities
20:47 thd kados: what aspect of the levels do you mean?
20:47 kados if we were designing a hierarchical tree based on the structure of a corporation
20:48 thd kados: all the semantic values for volume etc.?
20:48 kados what we have now is a way to represent the relationships (CEO, president, vice president, etc)
20:48 what we need is a way to represent the 'entities' that fill those roles
20:49 ie, name, address, salary, etc.
20:49 alas, we're not designing such a tree :(
20:49 so ... what kinds of information do we need to know about our entities?
20:50 I suppose things like volume probably weigh in
20:51 thd kados: we can fill all the standard factors but users need to be able to add others that we can never enumerate in advance
20:52 s/users/script needs to read from some unknown data/
20:55 kados yep
20:55 thd kados: we have enumeration representing various periods
20:57 kados we need a framework whereby we can build 'groups' of holdings information
20:57 and then attach those groups to levels in the holdings relationship hierarchy
20:59 we also need to be able to attach the group to specific holdings records
20:59 so we need a 'groupId' in our relationship hierarchy
21:01 if it's defined it takes precedence over the group assigned by level
21:02 thd kados: how do groups differ from hierarchy levels themselves?
21:02 your last sentence confuses me.
21:02 kados ok ...
21:02 so you have the representation of the relationship
21:03 but that's different than information about each element in the hierarchy (each node)
21:03 for instance
21:03 you want to be able to move copies from one bib record to another
21:03 if the information and relationship are separate you only need to update one field
21:04 to make that transformation
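The one-field move kados describes is the adjacency list's strong point and can be sketched as follows. Table and column names follow the earlier chat; the sample rows are invented for the example.

```python
import sqlite3

# Because the relationship lives only in parentID, re-homing a copy
# (and, implicitly, its whole subtree) is a single-column update.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings_hierarchy (itemID INTEGER, parentID INTEGER)")
conn.executemany("INSERT INTO holdings_hierarchy VALUES (?,?)", [
    (1, None),  # bib record 1
    (2, None),  # bib record 2
    (3, 1),     # a copy currently attached to bib 1
])

# Move the copy from bib 1 to bib 2 by updating one field.
conn.execute("UPDATE holdings_hierarchy SET parentID = 2 WHERE itemID = 3")
parent = conn.execute(
    "SELECT parentID FROM holdings_hierarchy WHERE itemID = 3").fetchone()[0]
print(parent)  # 2
```

Note that with pure nested sets the same move would require renumbering lft/rgt values across the affected subtrees, which is why combining both representations (as kados proposes later) is attractive.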
21:04 the entity itself will contain things like:
21:04 status
21:04 barcode
21:05 thd kados: what chris refers to as groups in non-MARC Koha that MARC Koha killed.
21:05 kados item-level call number
21:05 maybe, I'm not too familiar with how that works
21:06 from what it sounds like the only reason MARC killed the groups
21:06 was because the default framework in Koha wasn't set up completely
21:06 it sounds like we could have easily defined another level within the record
21:06 thd kados: you could reassign items to groups for various purposes.
21:07 kados yep
21:09 thd groups is not the table name but chris calls it that for understanding how it was envisioned
21:10 Unfortunately, a fixed set of bibliographic elements were tied to groups so MARC Koha killed them.
21:11 s/killed them/killed their flexible use/
21:12 kados: my revised holdings doc provided a way to fix this in 2.X
21:14 kados: you saw that suggested change before with virtual libraries.  Very convoluted hack on what was currently in MARC Koha.
21:15 kados: continue with your example. I think I interrupted you.
21:17 kados: what information would be in the group table?
21:21 kados: Is the group table the entity table?
21:24 kados: are you there?
21:25 kados yep
21:25 I'm unclear
21:26 I need to talk to a real database designer :0(
21:26 thd kados: do you mean chris?
21:27 kados yea, just sent him an email
21:28 thd kados: About what are you unclear?
21:29 kados: you were the one describing :)
21:31 kados well, I begin to see the problem
21:31 and I have a notion that there is a solution :-)
21:31 but I'm not sure if I've landed on it
21:31 thd chris fulfilled his contract and is now off enjoying his summer weekend in NZ.
21:33 kados: There is certainly a solution much better than what MARC provides.
21:35 kados: performance is related to the number of joins required to fulfil a query.
21:39 kados: there may be multiple solutions that function.  You need the one that scales best for arbitrarily large data sets where the data is not nice and regular the way it never is in the real world.
21:46 kados: I have the answers for you.
21:46 kados: Or at least the bibliography that I had compiled.
21:46 SQL TREE REFERENCES
21:46 Celko, Joe. Chapter 28 Adjacency List Model of Trees in SQL ; Chapter 29
21:46 Nested Set Model of Trees in SQL : Joe Celko's SQL for smarties : advanced
21:46 SQL programming. 2nd ed. San Francisco : Morgan Kaufmann, 2000.
21:46 Celko, Joe. Joe Celko's Trees and hierarchies in SQL for smarties. San
21:46 Francisco : Morgan Kaufmann, 2004.
21:46 Kondreddi, Narayana Vyas. Working with hierarchical data in SQL Server
21:46 databases : Narayana Vyas Kondreddi's home page. (Aug. 12, 2002)
21:46 http://vyaskn.tripod.com/hiera[…]ver_databases.htm .
22:03 kados thd: I've got the Celko books
22:04 thd kados: I am too poor to have them now.
22:04 kados :-)
22:06 thd kados: I won big prizes for book buying from O'Reilly and then could not get my employers to pay me after that.
22:08 kados: At least you have all the right references.  I had divested myself of SQL books a few years ago when I realised that SQL was the wrong model for bibliographic databases.
22:10 divested means I sold them in my bookshop and saw little reason to retain them at that time.  Yet I never had the Celko books.  I had some unstudied good books by Date.
22:11 pez hello
22:11 thd hello pez
22:12 hello jOse
22:12 j0se : )
06:30 osmoze hello
