IRC log for #koha, 2006-04-23

All times shown according to UTC.

Time	Nick	Message
12:01	owen	I think my wireless router was built by monkeys.
12:01	paul	good luck with your jungle owen.
12:01		& bye all, dinner time for me
12:01	kados	bye paul
12:03	owen	kados: so WIPO reports that recent acquisitions is working from opac-search.pl but not opac-main.pl?
12:04		No--the other way around?
12:05	kados	working on opac-main
12:05		not on opac-search
12:05		wipoopac.liblime.com
12:06		question about your sysprefs
12:06		1. opacheader, Textarea, 30|10, Enter HTML to be included as a custom
12:06		header in the OPAC
12:06		what's 30|10?
12:06	owen	that's the columns/rows numbers for the system preference setup
12:07		"variable options" under "Koha internal"
12:07	kados	ahh ... never done one of those before
12:07	owen	Yeah, I think opaccredits needs to be updated that way too. I think right now it's just a single line entry
12:07	kados	right
12:08		owen: do you feel comfortable making the change to updatedatabase?
12:08		owen: all you need to do is add one block
12:08	owen	I can look at it, but I've never touched it before.
12:08	kados	owen: for an example, take a look at the Amazon sysprefs i created
12:08		it's not hard at all
12:09		bbiab
12:11	ToinS	bye all....!
16:01	kados	thd: you there?
16:02		thd: i thought I remembered you saying that the native alaskan scripts we had been working on recently were mapped on the LOC website
16:02		thd: but I can't find that chart anywhere
16:19	thd	kados: Well, at least the Cyrillic scripts are working. The 11th and 12th records had been in Cyrillic and not a native Alaskan language.
16:20		kados: I may not have native Alaskan in the sample that I have been testing. I see something where I cannot interpret the characters in vim and I know it is not English :)
16:24		kados: I had been having XML::Parser errors bringing processing to a halt on the first record encountered with Cyrillic previously.
16:35	kados	thd: aha!
16:35		thd: so then, the solution is to first convert to utf-8
16:36		thd: before doing anything else
16:36		thd: that can most easily be done like this:
16:36		my $uxml = $record->as_xml;
16:36		my $newrecord = MARC::Record::new_from_xml($uxml, 'UTF-8');
16:37	thd	kados: yes, that is the solution and if the native Alaskan records have problems they can be imported in MARC-8
16:38		kados: your XML solution would not work for me when it came to the 11th record.
16:38	kados	thd: really?
16:39		are you sure it was those instructions failing?
16:39		and not something else?
16:39		I'll do a test case on my machine
16:40	thd	kados: as soon as XML::Parser met those instructions for the 11th record everything died.
16:40	kados	hmmm
16:40		so that _does_ indicate that the encoding mapping wasn't working
16:40		but I need a test case to prove it to myself
16:40		and so I can show the error to Ed Summers
16:41	thd	kados: I know that you did not seem to be able to reproduce the XML::Parser error on your system.
16:41		kados: At least I worked around it for any system :)
16:45	kados	thd: my test fails on record 11 also
16:46	thd	kados: I do have one Cyrillic character in the 11th and 12th records for which I have no glyph in UTF-8 using whatever font is default for Koha.
16:46	kados	interesting
16:46	thd	kados: I suspect that display within Koha is merely a font issue.
16:48		kados: if I open the corresponding Z39.50 client pages using fonts set by my own client style sheet and UTF-8 conversion in YAZ then all looks well.
16:49	kados	interesting
16:49	thd	kados: Records 11 and 12 correspond to original records 15 and 16 in the full set.
16:49	kados	right
16:50	thd	kados: look at 15.html and 16.html saved by LWP .
16:52	kados	why don't they match up to 11.html and 12.html?
16:53	thd	kados: if the automated script had found every one of the original records then they would match.
16:54		kados: however the 11th record has an original record ID of 15 recorded in the extra values file.
16:55	kados	thd: the error I get is :
16:55		utf8 "\xEC" does not map to Unicode at /usr/local/lib/perl/5.8.4/Encode.pm line 167.
16:55		when I attempt to do the marc8->utf8 using the new_from_xml routine
16:57	thd	kados: I could not run the script far enough to see that error because I had the XML::Parser error stopping everything.
16:58	kados	thd: i just sent a message to Mike and Ed with the test case
16:58		thd: hopefully they'll have a chance to take a look soon
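
The message quoted at 16:55 is the kind of error Perl's Encode module raises when asked to decode bytes that are not valid UTF-8. The following is a minimal sketch of that behaviour using only core Perl, not a reproduction of the actual test case sent to Mike and Ed; the sample string is hypothetical, and whether this is what happens inside new_from_xml is not established in the log.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample data: ASCII text with a lone 0xEC byte, which is
# not a valid UTF-8 sequence on its own.
my $bytes = "abc\xECdef";

# FB_CROAK makes decode() die on malformed input instead of substituting
# a replacement character; LEAVE_SRC keeps $bytes unmodified.
my $decoded = eval {
    decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
my $error = $@;

print defined $decoded ? "decoded ok\n" : "decode failed: $error";
```

If the record data is still in MARC-8 when something treats it as UTF-8, a failure of this shape would be expected, which fits kados's reading that the encoding mapping was not applied.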
16:59	thd	kados: although I have not removed it yet, does saving the file in UTF-8 by opening it in that mode not seem to be a possible source of difficulty?
17:00	kados	is it opened in utf8 mode?
17:00		the outfile is, but the infile is non-specific
17:00	thd	kados: Perl should just be saving whatever is in the record converted already or not.
17:00		kados: I was referring to out-file.
17:00	kados	I don't think that should matter
17:01		I presume that the data is in utf-8 before it is saved to the filehandle
17:01	thd	kados: you mean that behaviour would be no different without opening outfile in UTF-8?
17:02	kados	it would be pretty simple to test ;-0
17:02	thd	kados: yes, the conversion is done before saving.
17:03		kados: Yet what is opening the outfile in a different format supposed to actually do?
17:03	kados	thd: just tested it, the behavior is the same
17:03		thd: it sets perl's utf8 flag
17:04	thd	kados: Do you mean that it stores meta information about encoding in a non-existent file meta-bit.
17:04		?
17:04	kados	thd: which ensures the data is written as utf-8 and not mangled by perl's internal
17:05		I have no clue how perl stores utf8 internally
17:05		but I do know there is a flag that marks data as utf8 or not
17:05		to properly write out utf-8 that flag must be set when opening a filehandle
17:06	thd	kados: Perl ought to be able to write any arbitrary encoding that I just invented today by writing whatever characters I tell it to write.
17:07		s/characters/bytes/
17:08	kados	right
17:09	thd	kados: Perl should not mangle anything unless I am counting string lengths in bytes and not characters but we are not counting string lengths of non-ASCII data in this code.
17:10	kados	it still fails on the 11th record when removing that specification
17:10		so it's a moot point IMO
17:11	thd	kados: Ok, I was just curious to be sure it was not doing extra encoding or something.
17:12		kados: I could imagine it encoding each byte in UTF-8 but I would certainly have expected to see different output from what I have were that the case :)
17:13	kados	right
17:13		yea, I think I tried that before, because I was worried it was double-encoding or something
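
The double-encoding concern raised above can be checked with core Perl alone. The sketch below is an illustration, not the script under discussion: the helper write_with_layer and the sample character are assumptions. It writes the same text to in-memory files three ways and compares the byte counts.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One non-ASCII character (U+0416, Cyrillic capital Zhe) as a character string,
# and the same text as raw UTF-8 bytes.
my $chars = "\x{0416}";
my $bytes = encode('UTF-8', $chars);

# Hypothetical helper: write $data to an in-memory file through the given
# PerlIO layer and return the raw bytes that were written.
sub write_with_layer {
    my ($layer, $data) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer) or die "open failed: $!";
    print {$fh} $data;
    close($fh) or die "close failed: $!";
    return $buffer;
}

my $ok      = write_with_layer(':encoding(UTF-8)', $chars);  # characters via a UTF-8 layer
my $raw     = write_with_layer(':raw',             $bytes);  # bytes via a raw layer
my $doubled = write_with_layer(':encoding(UTF-8)', $bytes);  # bytes via a UTF-8 layer

printf "chars via UTF-8 layer: %d bytes\n", length $ok;
printf "bytes via raw layer:   %d bytes\n", length $raw;
printf "bytes via UTF-8 layer: %d bytes\n", length $doubled;
```

The first two cases produce the same two bytes. The third produces four, because the layer encodes each already-encoded byte again. So the layer matters only if the data reaching the filehandle is bytes rather than decoded characters, or contains non-ASCII text at all.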
17:18	thd	kados: I had felt perfectly awake when we were communicating this morning even though I should have felt tired. Inability to see came over me within a couple of hours and I slept until a short time before you pinged me just now.
17:18	kados	ahh sleep :-)
17:19	thd	kados: It is strange how I can go from feeling perfect to not being able to function very rapidly, especially if I eat something :)
17:20	kados	heh
23:15	rach	although I get the carb crash when I eat too many carbohydrates in one go
