IRC log for #koha, 2005-06-08


All times shown according to UTC.

Time Nick Message
12:30 slef anyone like that email?
12:30 the summary one
13:51 pate yeah, i did
14:05 slef I wonder if I can do those weekly :-/
14:55 paul slef : would be great.
14:57 kados slef: thanks for the summary!
14:57 slef: it's something I've been meaning to do but didn't get around to
15:12 sweet ... I've got a zap web-based Z39.50 server working:
15:12 s/server/client/
15:12 http://liblime.com/zap/try.html
15:13 Still having probs with that large dataset so the search on the localhost doesn't work
15:13 but you can search NPL's data using the following setting:
15:13 66.213.78.76:9999/VOYAGER
15:13 (under 'server')
15:14 I guess you have to use USMARC to get results
15:14 huh ... maybe not
15:15 owen SUTRS works, but not HTML
15:15 kados right
15:15 heh xml does though
15:15 owen What's SUTRS?
15:15 Couldn't be much faster
15:15 kados it's a text based record format
15:16 we'll see how it fares on the big dataset as soon as I resolve my path problems
15:20 sanspach something odd with charset translation though--standard european diacritics are displaying as Chinese characters!
15:26 kados here's an advanced query page:
15:26 http://liblime.com/zap/advanced.html
15:26 (still just a proof of concept)
15:26 but here you can request from multiple sources and get results back
15:27 something like this with an integrated MARC editor will probably be a good solution for catalogers
15:27 sanspach: could you give me an example?
15:28 sanspach searching es33.uits.indiana.edu:2200/Unicorn for author=durrenmatt retrieves great examples
15:28 available via LC's gateway under Indiana University for useful side-by-side compare
15:28 kados right
15:29 interesting
15:29 they are displaying as ? for me
15:30 I'll have to figure out what to do with different charsets
15:30 I think utf8 all the way for display
15:30 sanspach I got a mix of Chinese and square boxes (which usually means a Unicode char. not displayable w/installed fonts)
15:30 kados right
15:31 sanspach odd the way the chars are corrupted, though--problem for both composite character *and* next one
15:33 instead of D!rrenmatt for Dürrenmatt it is D!!renmatt (where ! is bad char)
15:33 and M!!chen instead of M!nchen
15:33 kados right
15:33 not sure why ... I'll have to look into it
15:35 slef u-umlaut is 2-bytes in utf8 IIRC
15:36 oh wait
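
(A minimal sketch, not from the log, of the point slef is making: u-umlaut is one character but two bytes in UTF-8, and feeding those raw bytes to a double-byte decoder fuses them with their neighbours, which is the kind of corruption described above. The choice of GBK below is purely an illustrative assumption.)

    use strict;
    use warnings;
    use Encode qw(encode decode);

    binmode(STDOUT, ':utf8');

    # u-umlaut is a single character but two bytes in UTF-8 (0xC3 0xBC).
    my $name  = "D\x{FC}rrenmatt";                 # Dürrenmatt
    my $bytes = encode('UTF-8', $name);
    printf "%d characters -> %d UTF-8 bytes\n",
        length($name), length($bytes);             # 10 -> 11

    # Decoding those raw bytes as a double-byte CJK charset (GBK here,
    # just as an example) pairs 0xC3 with 0xBC into one CJK character:
    # classic mojibake of the sort reported above.
    print decode('gbk', $bytes), "\n";
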
16:48 owen hi rach
16:48 rach hi
16:54 owen bye rach ;)
19:36 kados http://olinks.sourceforge.net/
19:36 http://www.theresearcher.ca/index.html
19:37 it'd be nice to have openurl, and link resolution integrated into Koha
19:38 chris around?
19:39 hey owen
19:39 owen hey
19:39 kados what's up?
19:39 chris yep, pretty busy though, whats up?
19:39 kados http://liblime.com/zap/advanced.html
19:39 that's the advanced page
19:40 here's the simple:
19:40 http://liblime.com/zap/try.html
19:40 zap is a web-based z39.50 client
19:40 chris right
19:40 thats fast
19:40 unfortunately now we need to slow it down some
19:41 by adding other stuff
19:41 kados right
19:41 chris but it still should be faster overall i think
19:41 kados yep
19:41 chris (i.e. we want to be able to search by branch, only return items that aren't lost to the opac etc)
19:42 kados actually, searching by branch could just be a separate index
19:42 chris hhmmm
19:42 you want to be talking to the real db for item stuff
19:42 kados and returning non-lost items is pretty easy too since we do the items query after we do the marc query
19:42 chris it changes a lot
19:42 kados yep
19:42 but that part of the search (now) is really fast
19:42 chris yep
19:42 kados it's the initial marc_word that's so slow
19:42 chris so you get a bunch of biblionumbers
19:43 and then return only the ones with items that match our criteria
19:43 kados we shouldn't even need to change our item-query section at all
19:43 chris shouldn't be too hard to do
19:43 kados it already does the stuff we need it to do
19:43 chris true
19:43 kados we just need a yaz wrapper to grab the biblionumbers
19:43 chris yep
19:43 kados I'm gonna try to get to that tonight
19:43 chris Net::z3950
19:43 kados yep
19:44 well ... dinner first ;-)
19:44 chris take a look at the z3950 client
19:44 kados ok
19:44 chris its pretty much the base for it
19:44 kados i'm gonna really look closely at zap as well
19:44 chris we dont need to do the daemonising etc either
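
(A rough sketch, not from the log, of the kind of Net::Z3950 wrapper being discussed: ask the Zebra server for matching records over Z39.50 and pull out the biblionumbers to feed to the existing item query. The host, port, database name, bib-1 attribute, and the assumption that the biblionumber lives in MARC 090$c are all placeholders.)

    use strict;
    use warnings;
    use Net::Z3950;       # classic Perl Z39.50 client
    use MARC::Record;

    # Placeholder connection details for a local Zebra server.
    my $conn = Net::Z3950::Connection->new(
        'localhost', 9999,
        databaseName          => 'koha',
        preferredRecordSyntax => Net::Z3950::RecordSyntax::USMARC,
    ) or die "can't connect to Zebra";

    # PQF query; bib-1 attribute 1=4 is the title access point.
    my $rs = $conn->search('@attr 1=4 "duckling"')
        or die $conn->errmsg();

    my @biblionumbers;
    for my $i (1 .. $rs->size()) {
        my $marc = MARC::Record->new_from_usmarc($rs->record($i)->rawdata());
        # Assumes the biblionumber is stored in 090$c (a common Koha mapping).
        my $bn = $marc->subfield('090', 'c');
        push @biblionumbers, $bn if defined $bn;
    }

    # These biblionumbers would then go to the existing item query.
    print join(',', @biblionumbers), "\n";
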
19:46 i now have an xml file of the bibliographical data for the new library we are doing an install for
19:46 thanks to good IBM documentation and a smart local ex system librarian
19:46 kados sweet
19:48 chris <ROOT>                                                                                              
19:48 <BIB _ID = "15009">                                                                                
19:48  <TITLE_MV TITLE = "Behind the tattooed face"/>                                                    
19:48  <AUTHOR_MV AUTHOR = "BAKER, Heretaunga Pat"/>                                                    
19:48  <PUBLISHER_MV/>                                                                                  
19:48  <BARCODE2_MV BARCODE2 = "31506000153634"/>
19:48 it doesnt get much easier to parse than that
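
(For illustration only: a minimal XML::Simple parse of that export, using the element and attribute names visible in the snippet above; the file name and the empty-string defaults are assumptions.)

    use strict;
    use warnings;
    use XML::Simple;

    # Force the repeatable elements into arrays even when only one occurs.
    my $data = XMLin(
        'dynix_export.xml',
        ForceArray => [qw(BIB TITLE_MV AUTHOR_MV BARCODE2_MV)],
        KeyAttr    => [],
    );

    for my $bib (@{ $data->{BIB} }) {
        my $id      = $bib->{_ID};
        my $title   = $bib->{TITLE_MV}[0]{TITLE}       || '';
        my $author  = $bib->{AUTHOR_MV}[0]{AUTHOR}     || '';
        my $barcode = $bib->{BARCODE2_MV}[0]{BARCODE2} || '';
        print "$id\t$title\t$author\t$barcode\n";
    }
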
19:48 kados hehe
19:48 dynix exports in xml eh?
19:48 chris nope
19:49 but universe, the underlying database, can be made to
19:49 kados ahh
19:49 I'm surprised it doesn't export to MARC
19:49 chris it does
19:49 if you pay someone to set it up for you
19:50 kados ahh ... right ... that's what spydus does
19:50 it's broken ... you pay them $2K and they change a line in the code somewhere and voilà, it works ;-)
19:50 chris yeah
19:50 crippleware
19:50 kados yep
19:51 chris you can only get away with that in software
19:51 kados it's pretty sad how many librarians don't realize it's set up that way
19:51 it's their data but they can't get to it
19:51 chris exactly
19:53 slef MuggingWare
02:16 hdl hi
02:18 osmoze hi hdl :)
03:47 chris evening
03:47 paul chris is not sleeping yet ?
03:48 chris its only 8.48pm
03:48 im watching some tv before bed
03:48 paul right. We are in summer time. So there is only 10hours between us.
03:48 (GMT+12 vs GMT+2)
03:48 chris right
03:49 dynix -> xml -> koha
03:50 there seems to be plenty of koha work going on around the world at the moment
03:50 its good to see
03:51 paul some code from argentina & my happiness would be complete !
03:51 chris :)
04:03 slef new debian 3.1 out now
04:03 chris yep, saw that :)
04:04 gonna have to schedule a time to do an upgrade for our servers ... and reinstall a bunch of perl modules
04:04 heh
06:07 kados paul: what does the dictionary search do?
06:07 paul good afternoon ;-)
06:24 slef about?
06:25 here's a planet installation that seems able to handle my html: http://curtis.med.yale.edu/code4lib/
07:03 hdl kados : dictionary search allows ppl to search among all authorities even rejected forms.
07:04 paul the search is on authorities & on existing values in biblio titles / authors / subjects.
07:04 it should be a help for users who don't know exactly how to write what they're looking for
07:05 (what is the surname of Hugo, the author i'm looking for... let me see what exists in this library about hugo...)
07:05 (s/surname/firstname/)
07:06 kados right
07:06 thanks
07:06 paul note that it searches in 2 parts : authorities and existing values.
07:07 so libraries without authorities should be happy as well as others.
07:07 kados can't we import authorities from the library of congress for free (or the Bibliothèque nationale de France)?
07:08 (so authorities would come 'pre-loaded' in Koha?)
07:08 (or are authorities usually defined by the library)
07:08 paul there are possibilities to do such things.
07:08 both in fact ;-)
07:08 kados (I don't understand the process since NPL doesn't use authorities)
07:08 ahh
07:08 paul some very specialised libraries have a specific authority list. like CMI (mathematics, that uses AMS thesaurus)
07:09 most libraries use a common authority list.
07:09 In france, you can get authorities from BNF, Rameau.
07:09 kados it would be really nice if we preloaded some standard 'common' authority list and had it auto-update
07:09 paul yes but no.
07:09 kados (not nice for everyone ;-))
07:10 paul even for a library that uses Rameau, the complete rameau thesaurus is unusable : 420 000 entries just for personal names.
07:11 but let's ignore personal names, which are not heavily used.
07:11 let's speak about subjects.
07:11 even there, a thesaurus like Rameau is far far too wide for a public library.
07:11 and not specialized enough for a specialized library.
07:12 so, here in France, everybody wants Rameau, but nobody uses it ;-) except when you get your biblio from BNF too
07:12 because in this case, you have biblios & authorities coming from the same source.
07:16 slef kados: I don't use planet itself and you should use 1999's xhtml by now.
07:16 :)
07:16 I will work on the underlying libraries for planet koha probably next week, though.
07:17 or maybe this week if stuff goes well
07:17 kados :-)
07:17 slef planet has 2 big problems: it doesn't handle incoming items cleanly (which leads to planet-spam whenever someone updates a blosxom install) and it's in python
07:18 kados gotcha
07:22 paul kados : did you see askjeeves' extended features? Could be really nice in an OPAC
07:22 http://sp.ask.com/docs/mj/1.1/tour_intro.html
07:27 kados paul: that would be really easy
07:27 paul: I did an 'answers.com'
07:27 it's not finished yet (broken right now)
07:27 I'll look into askjeeves as well
07:35 paul, slef, hdl, Genji can I schedule the next bugsquashing meeting for Thursday June 9th at 14:00 UTC(GMT): in your time: http://tinyurl.com/dloaw
07:36 paul i won't be here.
07:36 (neither hdl)
07:36 kados paul: how about friday?
07:36 paul morning for me could be OK, but not evening.
07:36 (i'll be on Perl conference, see mail on koha-devel)
07:37 kados ahh ... right
07:37 how about next tuesday?
07:37 Genji anytime, anyday is okay.
07:37 kados in your time: http://tinyurl.com/asl8y
07:37 paul ok for me, after 15 GMT, because I have to pick up my 1st son when school lets out.
07:37 kados ok ... how about hdl?
07:38 Genji Err.. except 5:30pm to 10pm Wednesday and Friday. Japanese drumming.
07:38 hdl ok for me...
07:38 kados slef: ?
07:38 Genji i need a specific time
07:38 kados Genji: http://tinyurl.com/779cy
07:39 Genji Wha, that in 3 hours?
07:40 kados slef: Tuesday, June 14, 2005 at 15:00 UTC work for a Bugsquash meeting?
07:40 Genji: no... next week ;-)
07:40 Genji ah. good. ya. ill take it.
07:40 kados cool
07:46 slef kados: I'm out from 11am Friday until Sunday. Might be out Thursday lunchtime.
07:46 Tuesday 14 ok.
07:46 in fact looks fine
07:47 kados cool
07:47 paul: can you test this URL ? http://tinyurl.com/as66z
07:47 paul yes and ?
07:47 kados paul: it should return bugzilla with 68 bugs listed by order of importance
07:47 paul right.
07:47 kados great ...
07:47 thanks
07:48 paul some still assigned to steve tonnesen...
07:48 we should really update some bugzilla parameters...
07:49 how do you manage to write tinyurls so quickly ?
07:50 kados paul: go to tinyurl.com ;-)
07:50 paul: I can do that now (chris gave me bugzilla superpowers ;-))
07:50 paul: I'll do it as soon as I'm done with this email
07:51 paul I know this service. just wanted to know if you had a specific tool to write them so quickly or if you had a tab always open on this page.
07:52 kados paul: tab ;-)
07:54 paul: who should have simple acquisitions, authentication, and parameters?
07:55 paul is it possible to have an owner by version ?
07:55 kados hmmm, I don't think so
07:55 paul so parameters => me
07:55 acquisition => chris
07:55 auth => dunno (probably not me)
07:55 (are there bugs about authentication in fact ?)
07:56 kados hehe good question
07:57 only three
07:57 for version 2.0
07:58 I'll put chris
07:58 paul: any new categories to add?
07:59 paul stats maybe ?
07:59 & not sure acquisition.simple still means something.
08:00 kados right
08:01 I'll ask katipo about that
08:01 stats => paul ?
08:01 paul hdl
08:01 kados and whats the description for this component?
08:01 paul statistics & reports
08:07 kados paul: any news from Emiliano? If not I'll write him another email ;-)
08:07 paul no news, alas.
08:10 kados as a joke ;-)
08:11 paul be careful not to be ranked as spammer by his spamassassin...
08:11 (but could be fun anyway)
08:50 kados : i've parsed the first 20 bugs from your previous tinyurl
08:50 none of them can be fixed with more information or strategic chat.
08:51 s/with/without/
08:51 & at least 2 or 3 should be fixed already
08:51 so, the meeting will be very interesting ;-)
08:53 kados paul: :-)
08:53 paul 2.2.3 is really close to being "releasable"
08:53 kados paul: hopefully we'll get feedback by next week
08:53 great!
08:53 paul what is your opinion about next bugsquashing : should I wait or not ?
08:53 i think that I should release 2.2.3
08:54 kados go ahead
08:54 IMO
08:54 paul (after a translating time)
08:54 kados release early and often ;-)
08:54 paul ok, i'll announce 2.2.3 for next monday.
08:54 kados paul: but let owen catch up
08:54 paul (to devel & translate lists)
09:14 slef paul: I'll try to work the options out to avoid rel_2_2
09:15 paul slef : but it's interesting to know what happens on rel_2_2
09:15 (even if it should only be a little traffic)
09:17 slef hey, that log doesn't have any rel_2_2 as far as I can tell
09:37 paul 'morning owen.
09:37 owen Hi paul
09:38 paul will you stay around here ?
09:39 owen I'll be here all day
09:56 kados paul: it'd be nice if we could put together a "Koha Research Toolbar"
09:56 where you could store lots of different links -- search results being only one type
09:56 web urls (so a bookmarks section)
09:56 images
09:57 we could also integrate search APIs from other collections/ILSes
10:03 hdl kados : would it be developed in XUL ?
10:09 kados hdl: not sure ... I've never done a firefox extension
10:10 chris would probably have some suggestions
11:14 FrancoisL: howdy!
11:15 FrancoisL howdy, Joshua ! How's life ?
11:15 I'll be with the SAN tomorrow to list all functions we'll program - I'll keep you posted.
11:15 kados great!
11:15 I'll look forward to reading it ;-)
11:15 paul don't forget to add me to the mail.
11:15 kados things are well here
11:16 FrancoisL won't !
11:16 kados FrancoisL: have you seen the new searching 'proof of concept'?
11:16 http://liblime.com/zap/advanced.html
11:16 it's very fast
11:16 (only the Nelsonville checkbox is working atm)
11:17 FrancoisL Kool ! Is it through a z3950 client ?
11:17 kados FrancoisL: yep
11:17 FrancoisL kudos to kados...
11:21 Zap! is fazt!
11:21 kados FrancoisL: yep ... I'm working on a LibLime database that's 5 million items too
11:21 paul right.
11:21 kados FrancoisL: actually, it's the engine doing the Z39.50 magic that's fast.
11:22 FrancoisL kados: will it make it into next release ?
11:22 kados FrancoisL: server magic that is -- it's Zebra
11:22 FrancoisL: definitely
11:22 FrancoisL That's good news for the SANe.
11:22 kados FrancoisL: I'm hoping to have the 5Million records working in the next few days
11:22 FrancoisL: then we can really test the speed ;-)
11:22 FrancoisL: currently, NPL has 150K biblios
11:23 FrancoisL: so 5 Million is quite a few more ;-)
11:23 FrancoisL Can you wire the library of congress + French BnF ? Must be more than xxx mill. !
11:23 There's a public Z3950 server in BnF (hey, Paul !)
11:24 kados FrancoisL: I wonder if we could get access to the MARC records in LOC or BnF
11:24 FrancoisL: we could do a 'proof of concept' Koha install for them ;-)
11:24 paul in fact, francoisL, the idea is to use this server for searching in koha, instead of internal search.
11:25 FrancoisL Sounds like a good idea as it's faster and... does not rely on Marc_words (?)
11:25 kados searching currently is broken into two parts
11:25 FrancoisL: exactly
11:25 first, find biblio numbers
11:25 second, get item info for all items attached to those biblios
11:25 I plan to use Zebra for step 1
11:26 paul kados : the z3950 server of BNF is public, needs no authentication, but can be slow.
11:26 kados step 2 is actually quite fast
11:26 paul: can we download ALL the records?
11:26 paul as slow as LoC in fact ;-)
11:26 kados paul: is there some way to scrape it?
11:26 paul I don't think so.
11:26 FrancoisL Step 2 is straightforward MySQL...
11:26 kados FrancoisL: right
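
(A rough sketch, not from the log, of what step 2 looks like once step 1 has produced a list of biblionumbers: a plain DBI/MySQL query against the items table, skipping lost items as discussed earlier. Table and column names follow Koha's usual layout but should be treated as assumptions here, as should the connection details.)

    use strict;
    use warnings;
    use DBI;

    # Placeholder connection details.
    my $dbh = DBI->connect('DBI:mysql:database=koha;host=localhost',
                           'kohaadmin', 'secret', { RaiseError => 1 });

    # Biblionumbers returned by the Zebra / Z39.50 step.
    my @biblionumbers = (1234, 5678, 9012);

    my $in  = join(',', ('?') x @biblionumbers);
    my $sth = $dbh->prepare(
        "SELECT biblionumber, barcode, holdingbranch, itemlost
           FROM items
          WHERE biblionumber IN ($in)
            AND itemlost = 0"          # only items that aren't lost
    );
    $sth->execute(@biblionumbers);

    while (my $item = $sth->fetchrow_hashref) {
        printf "%s  %s  %s\n",
            $item->{biblionumber}, $item->{barcode}, $item->{holdingbranch};
    }
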
11:26 paul but, francoisL, could you ask the SAN to see if they have a CDROM ?
11:26 (as BNF also publishes CDs that are free for libraries)
11:26 kados so I'm working on a Yaz wrapper to do the initial query ... should have something later this week
11:26 FrancoisL What CDROM ? Opale ?
11:26 paul (the problem being i'm not a library)
11:27 I think so.
11:27 kados paul: right
11:27 if we could get a "super large" data set xxx million records
11:27 that would be really great
11:27 FrancoisL OK I'll ask them tomorrow.
11:27 (and in... french too !)
11:27 kados (though 5 mill is quite large)
11:27 right ;-)
11:28 also note that the MARC file for 5 million records is 4.5 gig
11:28 FrancoisL I think size does not matter - provided it's more than 200 K or so...
11:28 paul note joshua, that it would be unimarc, not usmarc.
11:28 but that does not matter probably
11:28 kados paul: yea ... I didn't see an easy way for Zebra to parse that
11:28 paul: but I didn't look too hard
11:29 paul mmm... very bad news...
11:29 kados paul: it's easier than you think
11:29 download, install
11:29 export records to raw marc
11:29 move records file to test/unimarc/records
11:30 edit test/unimarc/zebra.cfg to add any options you need (default ones are fine though)
11:30 run test.sh
11:30 it will index records and automatically start the server
11:30 paul yes, but in unimarc, title is in 200$a, not in 245, so i'm wondering how zebra can handle that !
11:31 kados well I'm assuming that test/unimarc/ knows that ;-)
11:31 if not you can edit zebra.cfg to specify which things to index
11:31 syntax is pretty simple
11:31 there is no test/unimarc
11:32 so you will need to write a zebra.cfg for unimarc
11:32 there's a marcxml
11:32 rsusmarc
11:32 and usmarc
11:32 (also some others like dmoz ;-)
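
(A hypothetical sketch of the UNIMARC mapping being discussed, modelled on the sample .abs files that ship with Zebra; directive names and index names vary between Zebra versions, so treat every line here as an assumption to be checked against the local install.)

    # zebra.cfg (fragment)
    profilePath: .:../../tab
    attset: bib1.att
    recordType: grs.marc.unimarc

    # unimarc.abs (fragment): map UNIMARC tags to bib-1 indexes,
    # e.g. title lives in 200$a rather than 245$a
    name unimarc
    attset bib1.att
    melm 200$a    Title
    melm 200$f    Author
    melm 700$a    Author
    melm 606$a    Subject
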
11:33 FrancoisL Let's say we want to display the title and author field in an USMARC record. The title is stored in field 245, subfield a and the author name is stored in field 100 where the subfields for surname is a and for first name it is h. This would look something like this in ZAP:
11:33      %%format usmarc
11:33      245
11:33      245/*
11:33      245/*/a "<B>Title:</B> $data"
11:33      100
11:33      100/*
11:33      100/*/a "<br><B>Author:</B> $data"
11:33      100/*/h ", $data"
11:33    
11:33 Given the values "The Ugly Duckling", "Hans Christian" and "Andersen" for title, first name and surname, respectively, the output would be:
11:33      <B>Title:</B> The Ugly Duckling
11:33      <B>Author:</B>Andersen, Hans Christian
11:33 The change should be painful but quick :)
11:34 paul I must leave now. Farewell
11:35 FrancoisL bye !
11:35 Leaving too - see ya all !
11:35 kados bye all

