IRC log for #koha, 2005-06-11

All times shown according to UTC.

Time	Nick	Message
15:01	rach	it's halloween?
15:02	owen	No, I checked.
15:02		I haven't checked for the possibility of a mummy's curse, though.
15:10	kados	hehe
15:18	rach	:-)
15:24		hi
15:24	sanspach	hello
15:28	rach	you guys have had a busy night :-)
15:28		well - day for you :-)
15:28	sanspach	yeah; still nothing solved, though :(
15:30	rach	you worked out the ^m tho
15:30		they are windows line breaks
15:30		end of line markers
15:30	sanspach	yeah, but still not certain which files are affected by it
15:31		and not exactly clear why some of the files that don't have them still don't work
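
The open question above, which files actually contain ^M (carriage return, byte 0x0D), can be checked mechanically. The following is a minimal sketch in Python rather than the Perl tooling mentioned in this log; the function names are made up for illustration and nothing here comes from the participants.

```python
from pathlib import Path


def count_carriage_returns(data: bytes) -> dict:
    """Count CR bytes and how many of them belong to a CRLF pair."""
    total_cr = data.count(b"\r")
    crlf = data.count(b"\r\n")
    return {"cr": total_cr, "crlf": crlf, "lone_cr": total_cr - crlf}


def scan_files(paths):
    """Return {path: counts} for every file that contains at least one CR."""
    affected = {}
    for path in paths:
        # read_bytes() loads the whole file; fine for small samples only.
        counts = count_carriage_returns(Path(path).read_bytes())
        if counts["cr"]:
            affected[str(path)] = counts
    return affected
```

For a multi-gigabyte batch the file would have to be read in chunks instead of all at once.
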
15:31	rach	and if you change one record to get rid of them that doesn't help - so make a 1 record clean file?
15:32		but I see gavin tried that
15:33		so gavin didn't manage to get it in?
15:34	gavin	hi
15:34	rach	hi
15:34	gavin	the stuff was inserting for me at the end
15:36		i took one of sanspach's files which he emailed me (small.sample2.mrc) and substituted one delimiter for another after which it worked
15:36	rach	ah cool
15:36	gavin	then a newer file failed due to having some weird win32 linebreaks stuck in the middle
15:36		no idea why they're there
15:36	rach	yep you'll have to take them out too, in the same sort of way
15:36		the magic of windows :-)
15:37	gavin	I haven't seen kados's big file so I don't know what is wrong with what he got
15:37	sanspach	it is probably messed up in exactly the same way
15:37		it seems that MARC::Record doesn't strip the trailing ^M from the leader field when it re-writes it
15:38		or maybe I messed it up; I'll have to check
15:38	gavin	yes if i remove the ^M from that one, it works too
15:38		the ^Ms are all over the middle of records
15:38		it almost looks like an editor wrapped them or something
15:39	sanspach	they're all separate lines to begin with ("flat" format)
15:39		but when MARC::Record writes them out, I figured all the formatting would be fixed
15:39	gavin	not any of the ones i've seen
15:39	rach	you wish :-)
15:40	gavin	do you mean marc format should have linebreaks? none that I've seen have them
15:40	sanspach	no, no just for me
15:40	gavin	but i know little or nothing about marc
15:41	sanspach	I get the data out of our system db (Oracle, but same for mysql) as separate lines
15:41	gavin	i see, and you patch them up together?
15:41	sanspach	then I put everything back together and have MARC::Record create true marc format out of them
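
One way to keep ^M out of the assembled records is to strip it from each flat line before anything is put back together. This is a sketch in Python, not sanspach's actual script; the sample lines in the comment are invented and do not reflect the real export layout.

```python
def clean_flat_lines(raw: bytes) -> list:
    """Split flat-format data on LF, drop trailing CRs, skip empty lines."""
    lines = []
    for line in raw.split(b"\n"):
        line = line.rstrip(b"\r")  # remove the Windows ^M, if present
        if line:
            lines.append(line)
    return lines


# Hypothetical input: two field lines with CRLF endings and a blank line.
# clean_flat_lines(b"LEADER 00000nam a\r\n245 Some title\r\n\r\n")
```

Cleaning at this stage avoids editing the binary MARC output afterwards, where record lengths and directory offsets depend on exact byte counts.
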
15:42	gavin	Oracle. that's an expensive library system!
15:42	sanspach	not for a univ. that has a site license already (!)
15:43		but yes, actually, Sirsi's Unicorn product isn't the cheapest out there
15:43	gavin	universities are indeed wonderful places
15:43	rach	ah well, at least it sounds like you know how to work on the data now
15:44	gavin	sanspach: what do you think we need to do with kados's data?
15:45	sanspach	rm * and start over
15:45	gavin	not fixable?
15:45	sanspach	I've lost track of what the problems might be.
15:45		if it is just ^M we could strip those
15:46		if it is subfield delimiters too, we could do that
15:46	gavin	as far as I can tell it boils down to ^M and possibly delimiter substitution which would be very quick
15:46		rather than go through the pain of downloading 2GB again
15:48	sanspach	problem is, I think the delimiter that's wrong is used elsewhere in the data, which means no global replace
15:48		I think the data's got to be processed again
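
The objection to a global replace can be illustrated as follows. The log does not say which character was used in place of the subfield delimiter, so the "$" below is purely a hypothetical stand-in, as is the sample field; only the 0x1F subfield delimiter is standard binary MARC.

```python
SUBFIELD_DELIMITER = b"\x1f"  # subfield delimiter byte in binary MARC
WRONG_DELIMITER = b"$"        # hypothetical; the real character is not named in the log


def naive_fix(field: bytes) -> bytes:
    """Blindly replace every occurrence of the wrong delimiter."""
    return field.replace(WRONG_DELIMITER, SUBFIELD_DELIMITER)


# Made-up field: two intended subfields (a, c), plus a "$" inside the data.
field = b"$aAnnual report$cprice $5.00"
fixed = naive_fix(field)
subfields = fixed.split(SUBFIELD_DELIMITER)[1:]
# The price is split off as a bogus third subfield.
```

Because the replacement cannot tell a delimiter from ordinary data, regenerating the records from the source is the safer route.
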
15:48	gavin	ah.
15:49		in that case I guess we'd better get the recreation process moving
15:50		would it help if we rehearsed on a small data set?
15:50	sanspach	definitely!
15:51	gavin	well if you want to give it a go and send me some stuff I'll try it out
15:51		then we can organise getting the 2GB batch off you
15:52		i have a good amount of bandwidth in my university which I can use for that
15:55	sanspach	OK, how should I get you the test files? I don't think putting them on my windows box and then sending them through email is good ?!
15:56	gavin	you were able to put it on a web server before
15:56		if you bzip it, windows will just treat it as a blob and it should be safe
15:56	sanspach	I'll work on that
15:56	gavin	so whatever works
15:58	sanspach	OK, same place: two files--one with 2 records, one with 100
16:01	gavin	those seem fine to me
16:04	sanspach	want to try 10K ?
16:05	gavin	yeah if you like. whatever size
16:05		but start thinking about bzipping it
16:05		it'll save both of us time and bandwidth
16:06	sanspach	gzip?
16:06	gavin	yeah, that's fine too; bzip2 just gets better compression (although it takes more cpu time)
16:07		if we step up to 2gb that'll make a whale of a difference
16:08	sanspach	don't seem to find bzip/bzip2 so I'll have to use gzip
16:08	gavin	no prob
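
The gzip versus bzip2 trade-off discussed above can be measured on a sample before committing to the full batch. This is a rough sketch using Python's standard gzip and bz2 modules; the sample data is invented, and actual ratios depend on the real records.

```python
import bz2
import gzip


def compare_compression(data: bytes) -> dict:
    """Compress data both ways and report the sizes in bytes."""
    gz = gzip.compress(data)
    bz = bz2.compress(data)
    # Sanity check: both formats must round-trip the input exactly.
    assert gzip.decompress(gz) == data
    assert bz2.decompress(bz) == data
    return {"original": len(data), "gzip": len(gz), "bzip2": len(bz)}


# Made-up, highly repetitive stand-in for a catalogue export.
sample = b"245 10 $a Example title $c Example author\n" * 5000
sizes = compare_compression(sample)
```

Running it on a slice of the real export gives a better idea of what the 2GB transfer would shrink to under each format.
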
19:23	kados	well that's a trick ;-)
19:24	chris	what's that then?
22:07	sanspach	kados: problems?
22:14	kados	sanspach: you still around?
22:14	sanspach	yeah
22:14	kados	sanspach: What's the deal with the latest conversion?
22:14		(looks like the process stopped)
22:14	sanspach	looks like the script stopped executing; I got disconnected a couple times, but I thought it would keep going
22:15		it was only about 1/4 done
22:15	kados	hmmm, guess not ...
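
If the script stopped because the dropped connection hung up its terminal (an assumption; the log does not confirm the cause), starting it detached from the terminal is one way to keep it running. This is a sketch for POSIX systems; `run_detached` and the script name in the comment are hypothetical.

```python
import subprocess


def run_detached(cmd, log_path):
    """Start cmd in its own session so a terminal hangup does not reach it."""
    with open(log_path, "ab") as log:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # no dependence on the terminal for input
            stdout=log,                 # output goes to a file, not the terminal
            stderr=subprocess.STDOUT,
            start_new_session=True,     # POSIX only: child calls setsid()
        )


# Hypothetical usage:
# run_detached(["perl", "convert.pl"], "convert.log")
```

Shell tools such as nohup or a terminal multiplexer serve the same purpose without any code.
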
22:15		I can start it on my end -- sound good?
22:15	sanspach	I removed the partial files
22:15		I had it running on my machine and it has finished
22:15	kados	sweet
22:15	sanspach	I'm bzip2'ing it now
22:15	kados	great
22:17	sanspach	as soon as it is done I'll start it transferring, but then I'm going to bed
22:17	kados	that's cool
22:18		shoot me an email with the size and I'll start indexing when it's finished uploading
22:18	sanspach	will do
22:32	Genji	kados: tried my search options sidebar?
02:31	paul	hi hdl
02:39	hdl	hi paul
